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1.  INTRODUCTION  AND  SUMMARY 


Many  statistical  problems  connected  with  re-entry  physics  require  that  we 
estimate  the  angular  frequency  in  a  single  trigonometric  regression  function  when  the 
phase  and  amplitude  are  unknown.  When  confronted  with  the  task  of  designing  estima¬ 
tion  procedures  which  must  keep  up  with  data  arriving  in  real  time,  the  computational 
aspect  obviously  becomes  all  important.  In  many  problems  of  current  physical 
interest,  an  optimum  method  from  the  point  of  view  of  statistical  efficiency  (a  classical 
goal  of  theoretical  statistics)  leads  to  calculations  which  are  prohibitively  time 
consuming. 

Let  us  suppose  we  have  observations 

z^  =  p  cos(0t  -  (p)  +  v^_  (t=l,  2, . . . ,  n),  (1.1) 

where  {v  }  is  some  zero  mean  stationary  noise  process  and  0  <  0  <  7r  is  to  be 

estimated.  Consider  the  simplest  (unrealistic)  situation  in  which  <p  =  0  and  p  f  0  is 

2 

known.  When  the  errors  are  independent  normal  variates  with  variance  <j  ,  the 
Maximum  Likelihood  estimate  of  the  parameter  value  £  =cos  0  is  a  solution  of  the 
equation 

8IZtTt'«)  =  pU2n<{)  (1.2) 

t=l 

where  prime  denotes  differentiation  with  respect  to  £,  and  T^ij)  =  cosn0  and 
U  ,(£)  =  sinn0/sin  0  are  the  first  and  second  kind  Tchebichev  polynomials  in  £  of 

1  a  ^ 

degree  n  and  n-1  respectively.  If  there  is  a  solution  £  =£(z^,  • .  • ,  z  )  of  Eq.  (1.  2) 
which  consistently  estimates  £  as  n—  00 ,  then  0  =  cos  |  will  attain  the  Cramer -Rao 
lower  bound: 

£(0-0)2  e  ^  (1.3) 

P  n 


1 


The  price  paid  for  this  rate  of  convergence  would  be  a  difficult  and  lengthy 
computational  job.  Indeed,  £  is  a  solution  of  a  polynomial  of  degree  2n-l  with  the 
first  n  coefficients  depending  on  the  data.  In  many  applications  it  is  sheer  folly  to 
consider  solving  such  a  numerical  problem  in  real  time.  The  difficulty  is  of  course 
compounded  when  (p  and  p  are  unknown,  since  then  we  would  have  to  locate  the 
maximum  of  a  highly  nonlinear  function  of  three  variables.  (The  right  side  of 
Eq.  (1.3)  would  be  increased  by  a  factor  of  4,  independent  of  whether  or  not  p  is 
known. ) 

A  computationally  more  feasible  method  for  estimating  9  can  be  based  on  the 
theory  of  empirical  spectral  analysis  (see  Parzen,  Ref.  [  9])..  Let  the  noise  in  Eq.(l.  1)  now  be  a 
weakly  stationary  process  with  summable  covariances  cr(k)  =  (5vtvt+  j^j  •  For 
indices  k  =  0,±1,±2, . . . ,  ±(n-l),  define  sample  covariances 

Cnik)=HZxZ 


(1.4) 


t=l 


The  time  series  {zj-  is  called  asymptotically  stationary  because 

lim  £  C  (k)  =  ^p2cosk0  +cr(k) 


(1.5) 


for  each  fixed  k.  The  right  side  is  the  Stieltjes  cosine  transform  of  a  bounded 

,  2 

nondecreasing  function  having  a  discontinuity  of  magnitude  \  p  at  the  frequency  9. 

By  transforming  m=o('^/n)  of  the  covariances  in  Eq.  (1.  4)  with  suitable  weights 

which  depend  on  the  truncation  point  m,  one  obtains  a  sample  function  of  frequency 

S*(w).  In  large  samples  this  will  have  a  mean  value  approximately  equal  to  f(co)  if 

2  2  2 

cu  f  9,  but  at  9  equal  to  A  p  m  +  f(0)  where  A  is  a  known  constant.  One  expects, 
therefore,  the  graph  of  S*(  • )  to  have  a  dominant  peak  at  co  =  0  for  large  truncation 
points.  The  frequency,  say  9*,  at  which  S*(  •)  achieves  its  largest  value  is  a 
consistent  estimate  of  the  unknown  9. 

One  can  approximate  9 *  without  going  through  the  time  consuming  operations 
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involved  in  calculating  S*(w)  over  a  grid  of  cu  values.  (One  would  want  to  do  so,  of 

course,  when  there  is  interest  in  the  overall  shape  of  the  spectrum.)  This  is 

accomplished  by  self-convolving  the  even  sequence  A  C  (k)  =  ^  (k)  over  |k|  :£  m 

k  n  u 

to  obtain  a  new  sequence  ^(k),  normalized  so  that  =  1,  which  is  nonzero  for 

| k |  ^  2m.  Replacing  •)  by  $  (  •)>  we  generate  a  sequence  i/^k)  f°r  1^!  —  4m.  In 

a  recursive  fashion,  therefore,  we  arrive  at  i^(0)  =  •  •  •  ,^(K)  after  J 

iterations.  By  a  proper  choice  of  K  =  o(2^)  ,  the  number  of  axis  crossings  of  these 
K+l  numbers  (excluding  a  certain  small  band  around  the  zero  value  of  the  ordinate)  , 
after  division  by  K,  yields  a  ratio  which  converges  to  9*/tt  as  J  —  °°.(Cf.  Gardner,  Ref.  [4]  . ) 

Although  this  method  is  considerably  faster  than  Least  Squares,  and  correspondingly 
statistically  less  accurate,  it  is  still  in  essence  a  (large)  fixed  sample  size  procedure. 

Such  methods  are  generally  not  appropriate  for  real  time  estimation.  What  is  needed 
are  techniques  which  are  recursive  in  the  sample  size  n,  so  that  the  estimate  can 
be  rapidly  updated  as  each  new  observation  arrives  using  only  small  finite-memory 
computations.  Albert  and  Gardner  Ref.  [  1]  introduce  and  analyze  a  class  of 
estimates  of  a  parameter  in  certain  nonlinear  regression  problems  with  this  point  of 
view  in  mind.  It  is  appropriate  that  we  briefly  discuss  these  procedures  here.  Suppose 
at  time  t  wc  observe 


z  =  F(0)  +  v 
t  t  t 


(t  =  1,  2,  ... ) 


where  the  sequence  F  ( • ),  F  ( • ), .  . .  is  prescribed.  A  recursively  computed 

-L  Z 

^-estimate  (suggested  by  the  recursive  formula  for  the  Least  Squares  estimate  in  the 
linear  case)  is 


e 

n 


=  0n  l+anlZn"FA  1  ^ 
n-1  n  n  n  n-i 


(1.6) 


(n=l,  2, .  . .  ;  0q  arbitrary). 


The  procedure  (called  "successive  relinearization")  is  specified  by  a  choice  of  the 

so-called  gain  sequence  a  ,a  , . . . ,  where  a  may  or  may  not  depend  on  the  available 

j.  jL  n 
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iterates  0  ,  0  , . . . ,  0  .  Under  certain  restrictions  on  the  sequence  of  regression 

0  1  n-1 

function  derivatives,  and  the  assumption  that  {v^}  has  uniformly  bounded  second 

moments,  0^  is  a  strongly  consistent  estimate  of  0  as  n  —  00  for  a  wide  variety  of 

recursively  computable  gains.  For  the  case  of  independent  errors  with  common  variance, 

a  transformed  version  of  Eq.  (1. 6)  with  particular  iterate -dependent  gains  is  asymptotically 

efficient  when  (and  only  when)  these  errors  are  Gaussian.  However,  the  mean  square 

error  of  the  estimate  has  the  proper  n  dependence  in  all  cases,  i.e.  it  goes  to  0  like 
n  ? 

(an  assumption)  1/  *  (0)  as  n  —  00 .  Those  familiar  with  the  behavior  of  stochastic 

approximation  schemes  (to  which  the  above  bears  a  very  close  formal  similarity)  will 
not  be  surprised  that  F(  •)  is  required  to  be  monotone  over  the  parameter  space  9 
for  all  sufficiently  large  time  indices: 

sgn  Ft'(x)  =  st  =  ±1 

independent  of  the  argument  xe9.  Such  obviously  fails  to  hold  for  the  regression 
function  in  Eq.  (1. 1),  because  as  time  progresses  it  has  an  increasing  number  of 
zeroes  with  respect  to  0.  Even  if  the  nuisance  parameters  cp  and  p  were  known, 
then,  we  could  not  directly  apply  this  existing  technique  for  nonlinear  regression 
problems.  (Can  a  scheme  of  the  form  of  Eq.  (1.  6)  be  exhibited  which  consistently 
estimates  0  for  F^(0)  =  cos  0t?) 

We  are  going  to  treat  the  frequency  estimation  problem  in  its  own  right,  and 
introduce  two  0 -estimates  which  are  simple  functions  of  at  most  the  first  three 
sample  covariances  in  Eq.(1.4).  Since  the  lags  are  fixed  and  do  not  increase  with 
sample  size,  these  estimates  are  easily  generated  in  a  recursive  fashion.  They  are, 
of  course,  highly  inefficient  relative  to  estimates  requiring  an  unlimited  number  of 
covariances,  but  this  is  the  price  paid  for  computational  simplicity.  Actually,  we 
will  deal  with  a  model  which  is  more  general  than  Eq.  (1. 1);  viz. , 

zt  =  £[  pcos(  0t  -  (p)]  +u.  (1.7) 
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where  £[  •]  is  a  realizable  linear  operator  with  summable  impulse  response 
coefficients.  We  will  assume  {uj  is  a  one-sided  zero  mean  linear  process  (and 
ultimately  Gaussian  for  simplicity)  with  a  summable  fourth  order  moment  sequence. 

In  the  special  case 

u.=£vt>  (1.8) 

we  obtain  Eq.  (1. 1)  passed  through  a  linear  filter.  Although  Eq.  (1.  8)  is  the  case  in 
most  applications,  there  is  no  reason  to  so  specialize  Eq.  (1. 7)  because  the  mathe¬ 
matics  is  essentially  the  same. 

In  Sec.  2  we  derive  the  limiting  statistical  behavior  of  the  sample  covariances  in 
Eq.  (1.4)  when  {z^}  is  given  by  Eq.  (1.7).  In  generalization  of  Eq.  (1.5)  we  have,  as 
a  limit  as  n  —  00  both  with  probability  one  and  in  mean  square, 

Cn(k)  -  C(k)  =  p2|  A(0)  | 2  coskfl  +  cr(k) 

for  every  fixed  k,  where  A(.)  is  the  complex-valued  transfer  function  of  the 
operator  £  and  <r(  •)  now  denotes  the  covariance  sequence  of  {u^} .  Any  finite 
collection  of  random  variables 

Dn(k)  =  ^(Cn(k)  -  C(k)), 

corresponding  to  distinct  choices  of  the  lag  variable  k,  tend  to  joint  normality.  When 

{u  1  is  Gaussian,  the  covariance  between  D  (k.)  and  D  (k_)  of  the  limiting  distribu- 
1  r  n  1  n  2  ° 

tion,  written  in  terms  of  frequency,  is 

n  2 
cosk  w  cosk  o?f  (ai)dco 
1  z 

7T 

where  f( . )  is  the  spectral  density  function  of  {u^} . 

Using  these  results  we  introduce  in  Secs.  3  and  4  two  different  methods  for 
estimating  |  =  cosO  (-1  <  |  <  1).  We  present  them  here  only  for  the  special  case 


k  If 
12 


2  2 

=  47 rp  |A(0)|  f(0)coskj0  cosk20  + 47t 


J 
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r  i  ^ 

when  {u^}  is  a  white  Gaussian  process  with  variance  cr  .  The  respective  estimates 

are  then 


i  - 


Cn<‘> 


C  (O)-cr 
n 


IT  Cn<2)+  JcWsC^!) 

II  _  n  V  n  n 

n  4  C  (1) 

n 


We  estimate  9  by  cos  .  In  each  case,  's/nd  -|)  tends  to  be  normally  distributed 

n  2 

about  0  as  n  —  00  with  a  variance  V  (£).  These  functions  are  respectively  given  by 


2  2 

R  vi(|)  = 


A(0)|' 


■(2|  +1) 


2  2 

R  Vn(4)  = 


1 _  4|4-3|2  +  1 


A(0)  | 4  (2|2+1)2 


2  2  2  2 
where  R  =  /a  is  the  "signal  to  noise"  ratio.  We  have  (|)  ^  1  with 

equality  only  at  9  =  ir/2,  so  that  the  additional  computation  in  Method  II  offsets  the 

2  I 

knowledge  of  cr  presumed  in  Method  I.  In  the  general  nonwhite  case,  £  assumes 

n  n 

knowledge  of  cr(0)  and  a(l)  and  ^  of  cr(l)/cr(0)  and  a(2)/a(0).  In  addition,  the  latter 
involves  the  sample  variance  0^(0). 

Figure  1  is  a  computational  flow  chart  (without  initialization)  for  Method  II  in  the 
white  noise  case.  The  input  to  a  "box"  is  to  be  multiplied  by  the  value  of  the  enclosed 
symbol,  while  "circled"  symbols  indicate  operations  to  be  performed  on  the  input  (I). 
The  top  segment  of  the  diagram  is  the  recursive  covariance  calculation  for  lags  1  and 
2.  This  portion  is  run  continuously,  i.e.  the  time  index  n  is  stepped  by  unity.  When 
an  estimate  of  9  is  desired  it  is  only  necessary  to  connect  the  bottom  segment  to  the 
top  via  the  indicated  terminals  1  and  2. 
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Fig.  1.  Method  II  in  the  white  noise  case. 
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The  methods  are  applied  to  two  illustrative  situations: 


1. 

2. 


£  =  A*5  and  u^  =  £w^_  with  {w^}  white,  which  results  from  removing  a 
polynomial  trend  by  differencing. 

£  =  { 1,  -a^,  . . . ,  -a^}  and  u^  =  w^  versus  £  =  { 1}  and 


u=a,u  ,  +  •••+  a  u  +  w  ,  i. e.  a  comparison  of  prewhitening 
t  1  t-1  p  t-p  t 

and  no  prewhitening  when  a  regression  pcos(0t  -  cp)  is  disturbed  by 
an  autogressive  scheme. 


Numerical  results  are  presented  for  these  applications. 
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2.  SOME  LIMIT  THEOREMS 


In  this  section  we  consider  time  series  of  the  form 

z  =  £[pcos(  0t  -  (p)]  +u 

where  £  [  •]  denotes  any  realizable  linear  operator  with  summable  impulse  response 
coefficients,  and  {u^}  is  a  centered  stationary  process  whose  moments  up  through  order  4 
obey  certain  restrictions.  The  parameters  defining  the  regression  sequence  are 
arbitrary,  with  the  exception  that  9  is  presumed  to  be  an  interior  point  of  (0,  n). 
Specifically,  we  are  interested  in  the  joint  asymptotic  statistical  properties  of  sample 
covariances 

n 

z  z  ,  , 
ct  ct+h 

t=l 

corresponding  to  distinct  choices  of  the  lag  variable  h,  when  c  is  an  arbitrarily  fixed 

integer.  We  first  deal  with  the  deterministic  components. 

Theorem  1.  Let  £  =  {c^,  a2’  •  •  }  be  a  linear  operator  defined  by 


tty  +  o;  v  ,+0!„y  +■ 

(Tt  lyt-l  2yt-2 


for  which  |  c^q  |  +  |  a;  ^  |  +  |  |  +  •  •  •  <  00  *  ^'et 

g  (t)  =  £pcos(0t  -  cp)  (t  =  0,±l,±2, . . . ) 

0 

where  p  is  an  arbitrary  constant.  Choose  integers  a,b  and  c  (positive  or  negative). 
Then,  for  any  0  <  9  <  ir  and  <p ,  and  some  K  which  does  not  depend  on  a  or  b, 

n 


g8(Ct+a)g«<Ct+b) 


|p2|A(0)|  2  cos(b-a)0 


K 

n 


t=l 
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wherein 


A(6)  =')  a.e1^  . 
J=0 


Proof.  It  is  notationally  convenient  to  work  with  the  Tchebichev  polynomials  of 
the  first  and  second  kind 


Tt(£)  =  cost0 


ut<«>  ■ 


sin(t+l)fl 

sin0 


(2.1) 


which  are  of  degree  t  s  0  in 


£=cos0  (-1<£<1).  (2.2) 

Then  i - \ 

g0(t)  =  pcos  cpZTt(Z)  +  psmcp  Jl-H2  filled)  (2.3) 

where  T  ^  and  U  £  j=  -U  Without  loss  of  generality  we  can  obviously  assume 
p  =  1.  Deleting  the  fixed  arguments  6  and  | ,  we  then  have 

g(t+a)  g(t+b)  =  cos2<^£Tt+a£Tt+b 


+  sln2'»<1-«2>Ium-iIut+b-i  <2-4> 


+  cos  (psiiup  Vl-£2  (£T  ,  £U  +  £T  lt£U  ,  ) 
Y  Y  v  t+a  t+b-1  t+b  t+a-1  ' 


for  all  negative  and  positive  integers  a,b  and  t.  (It  should  be  clear  that  by  £Tt+££T  ^ 

we  mean  £[T  ,  ]  times  £[T  J  and  not  £  [  T  ,  £[T  ,.]].)  In  terms  of  the 
t+a  t+b  t+a  t+b 

polynomials,  standard  trigonometric  addition  formulae  become 


2  T  U 
t  s 


=  T  +  T 
t+s  t-s 


2<l-«  >U,-lUs-l 


=  T  -  T 
t-s  t+s 


T  U  .  +  T  U  . 
t  s-1  s  t-1 


=  U 


t+s-1 


(2.5) 
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Using  the  first  (resp.  second)  line  of  Eq.  (2.5)  in  the  first  (resp.  second)  line  of 
Eq.  (2.  4)  we  find,  with  the  abbreviations 


that 


h  =  b-a  -t  =  b+a, 


£T  ,  £T  ,,  =  )  oi  a.  T  .TV,  . 
t+a  t+b  l_j  l  j  t+a-i  t+b-j 

i.j^ 


and 


=  ->  aafT  +T  ) 

2  Lj  i  y  2t+l-i-j  h+i-j  ’ 

i»  j  —0 


<1-«2>1Ut4a-l£Ut4t,-l'  1  I  VjfVl  '  T2t+-t-i-j  >' 

i>  j— 0 

Similarly,  in  the  third  term  of  Eq.  (2.  4)  we  use  the  third  line  of  Eq.  (2.  5)  and  get 


£T  ,  £U  ,  +£T  .  £U. ,  , 

t+a  t+b- 1  t+b  t+a- 1 


=  )  Q'.Q'.(T  ,  .U  .  ,  +T  tl_  .U  L  .  ,) 
/_j  l  y  t+a-i  t+b-j -1  t+b-i  t+a-j-1 

i,j^0 


=  /  oi  a  U_ 

Lj  i  j  2t+-t-i-j-l 

i,j  ^0 

because  the  interchange  of  i  and  j  in  the  second  product  does  not  alter  the  value  of 
the  double  sum.  We  substitute  these  last  three  equations  into  Eq.  (2.  4),  replace  the 
dummy  t  by  ct,  and  then  sum  over  t  =  1,  2, . . .  ,n.  There  results 


11 


n 

i  ^  g(ct+a) g(ct+b)  =  i  £  VjTh+j-i 
t=l  i.jSO 


n 


+  cos  2<p  )  a. a.  —  /  T  .  .  . 
2  Lj  i  j  n  2ct+'l-i-j 

i»  j— 0 


L  t=l 


sin2<p 


i,  j^O 


n  l_j  2ct+-t-i-j-l 


t=l 


(2.  6) 


Now  from  the  summation  formulae  (Knopp,  p.  480,  Ref.  [  6]  ) 


n 


I 

t=l 


cos  2tco 


I 


sin  2tco 


t=l 


sin(2n-t-l)co  1_  _  sinnco  cos(n+l)co 
2  sinco  2  sinco 

(2.7) 

cos  co  -  cos(2irfl)co  _  sin  nco  sin(n-t-l)co 
2sinco  sin  co  * 


which  hold  for  all  real  numbers  co, 


we  see  that 


n  n 


t=l  t=l 


rl-£ 


U 


2ct+r 


.!«)! 


t=l 


^  2/  |  sinc0  | 

=  0(1)  provided  0  ^  0,  n 


(2.8) 


as  n  —  00  independently  of  r  =  0,±1, . . .  .  Since  {a.}  is  square  summable  we  can  take 

limits  inside  the  double  summations.  Hence,  both  the  second  and  third  terms  in 
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Eq.  (2.  6)  go  to  0  as  1/n,  and  do  so  uniformly  in  l  (but  not  in  0).  The  remaining  term 
is 


^  ^  a.  a  cos(h+-j-i)0  =  -|cosh0  ^  a  .a  cos(i-j)0 

i— 0  j>0  i>0  j^O 


2  2 

=  -^cosh0  ^  ^Q^.cosj0^  +  ^  ^  Q!j  sinj0^  J 


<  00 


j^o 


j^O 


which,  since  h  =  b-a,  is  the  asserted  result  with  p  set  to  1.  Q.  E.D. 

Remark.  Although  we  will  have  no  need  for  the  result,  we  note  in  passing  what 
happens  at  the  endpoints.  From  Eq.  (2.  6)  we  find  the  formula,  after  returning  the 
multiplier  p, 


l 

n 


n 

I 

t=l 


2  2.  ,2 

gg (ct+a) gg (ct+b)  =  cos  (p  p  |A(0)  |  cos(b-a)9 


(0  =  0  or  7r) 


which  is  valid  for  every  n. 

Remark.  The  conclusion  of  Theorem  1  is  invariant  under  translations  of  the  time 
index  on  which  £  operates.  Thus,  we  could  just  as  well  have  started  with  £y^  written 
forward  in  time,  or  even  as  a  two-sided  operator, 


a  v 
jyt-j 


j=-eo 


We  have  taken  £  to  be  realizable  since  we  will  be  interested  only  in  statistical 
applications.  An  important  special  case  is 


£  =  AP 


where  A  is  the  first  backward  difference  operator.  We  have 
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for  j=0, 1,  ...,p 


0 


for  j  2:  p+1 


(2.9) 


The  spectrum  of  this  operator  at  9  is 

P 


|A(0)|2  =  ^  (-i)J 

j=0 


P 


j=0 


=  (l-ei0)P  (l-e"i0)P 


(2.10) 


This  particular  filter  removes  from  a  series  {y^}  any  polynomial  trend  of  degree 
not  exceeding  p-1. 

Theorem  2.  Let  {uj-  be  a  stationary  time  series  with 


<fv°  fVt+|kf  ^ 

and  cr(  • )  summable.  Assume  also  that  {u^}  has  a  summable  fourth  order  moment 
sequence.  Let 


Z  =  g.(t)  +  u  (t  =  0,±l,±2, . . . ) 

t  tf  L 

where  g.  (t)  is  defined  in  Theorem  1.  Then,  for  any  fixed  integers  h  and  c, 
6 
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n 


—  >  z  z 
n  /_!  ct  ct+h 

t=l 


2  p2 1  A(0)  | 2  coshP  +  a(h) 


in  mean  square  and  with  probability  one  as  n  — 

Proof.  We  introduce  abbreviations 

n 


C(h)  =  ~  /  z  z 
n  n  l_/  ct  ct+h 

t=l 

T0(h)  =  |p2|A(0)|2cosh0  , 


(2.11) 


which  will  be  used  throughout.  We  further  let 

n 


X  (h)  =  7  >  g  (ct+h)u 
n  n  lj  v  ct 

t=l 

n 

Yn®-:2*»WuctH.  <2'12) 

t=l 

n 

Z  (h)  =  -  )uu  -  <r(h). 
n  n/j  ct  ct+h 

t=l 

By  hypothesis  these  latter  random  variables  are  centered  at  expectations.  It  follows 
from  the  definitions  and  Theorem  1  that 


C  (h)  -  yfl(h)  -  <r(h)  =  X  (h)  +  Y  (h)  +  Z  (h)  +  0(l/n)  (2.  13) 

n  v  n  n  n 

where  the  order  term  is  a  sure  one  as  n  —  °°.  The  conclusion  will  be  at  hand  if  we 
can  show  that  each  of  the  averages  in  Eq.  (2. 12)  tends  to  0  both  in  mean  square  and 
with  probability  one.  Such  is  easily  accomplished  with  the  help  of  the  following 
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Law  of  Large  Numbers  for  dependent  random  variables  (see  Parzen,p.  419,  Ref.  [  7] ): 
n 

Let  x  =  2  x  /n  where  x,,x_, ...  is  a  sequence  of  centered  random  variables  with 
n  t=  1  t  12 

uniformly  bounded  variances.  Then  a  sufficient(and  also  necessary)  condition  that 

^x  -Oasn-'ois/xx  —0.  Furthermore,  if  f  x  x  =  0(l/n  )  for  some  e  >  0, 
-  n  n  n  '•nn 

then  x  —  0  with  probability  one. 

The  first  two  averages  in  Eq.  (2. 12)  are  trivial  to  handle.  For  the  first  we  have 

n 

|(fXn(h)g0(cn+h)ucn|  =  “  I  ^  g0(ct+h)g0(cn+h)a(c(n-t))| 

t=l 

n^l 

^•£2fL’  ^  |cr(ct)|=0(l/n) 
t=0 

because  2  |  cr(ct)  |  <  °°  obviously  follows  from  2|cr(t)|<  00 .  The  same  kind  of  bound 
t  t 

obviously  holds  for  Y  (h).  For  the  remaining  sequence  we  find 

n 

n 

|  f  Z  (h) [ u  u  -  <x(h)]  |  s  -  Y  |(fu  u  u  u  -  a2(h)|  .  (2.14) 

1  c  n  cn  cn+h  1  n  1  u  ct  ct+h  cn  cn+h  1 

t=l 

The  fourth  order  moment  sequence 

'‘O'l-W  =  £utuH*1uHk2“tt*3 

can  be  written  in  the  form 

=  <7(kl)lr<1'2'k3)  +  ff<k2)a<krk3)  +  c,(k3)c,(krlt2)+  flNG<kl’k2,k3) 

(2. 15) 

where  tr  is  the  fourth  cumulant  sequence.  It  is  called  the  non -Gaussian  contribution 
NG 

to  n  because  it  vanishes  identically  when  {uj  is  a  Gaussian  process.  If  in  Eq.  (2. 15) 

r  2 

we  set  k  =  h,  k  =  c(n-t)  and  k  =  c(n-t)  +  h,  then  the  a  (h)  term  in  Eq.  (2. 14)  is 
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cancelled  by  the  leading  term  in  the  Gaussian  contribution  to  p.  After  reversing  the 
direction  of  summation,  we  therefore  have 


I  £  I  u  « 


n  cn  cn+h 
n-1 


a(h)] 


1  2 

^  —  )  la  (ct)  +  a(ct+h)a(ct-h)  +  p  (h,  ct,  ct+h)| 
n  NO 


t=0 


which  by  the  summability  assumption  is  0(l/n). 

Theorem  3.  In  the  notation  of  Theorem  2,  introduce  the  deviations 

^  n 


Q.  E.D. 


D  (h)  =  's/n 
n 


t=l 


i  2 

Z  Z  -  i  " 

ct  ct+h 


p  |A(0)|  cosh0  -  a(h) 


where  h  is  an  arbitrary  fixed  integer.  In  addition  to  the  restrictions  of  Theorem  2, 
suppose  that 

(t.  kj>k2  =  0,±1,±2, . . . )  . 


Then,  for  all  integers  a,b  and  c, 


lim  f  D  (a)  D  (b)  = 
n  n  n 


2  2 

2 p  j  A(0)  |  )  a(ck)  cos  ck0  .  cosa0  cos  b0 

k=-°° 


+  [  a(ck)  a(ck+b-a)  +  a(ck+b)  a(ck  -a)  ] 


k=-°° 


1  t‘NG(a'ck’ 


ck+b) 


k=-°° 


where  p_„  is  the  fourth  order  cumulant  sequence  of  {u  }. 

NG  t 

Proof.  From  Eq.  (2. 12),  whether  a  is  equal  or  unequal  to  b, 
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(2. 16) 


n  n 


fw)=  -4X  i«8<ct+a)[fUctUcsUcS*i'^Uct<r(b)1 
n  .  . 


t=l  s=l 


=  0 

for  every  n  by  our  simplifying  assumption  concerning  third  moments.  From  Eq.  (2. 13) 
we  therefore  have  the  formula 

<fD  <a)D  <b)  =  n<f[X  <a)X  (b)  +  Y  <a)Y  (b) 
n  n  v  n  n  n  n 

+  X  (a) Y  (b)  +  X  (b)  Y  (a)  (2. 17) 

n  n  n  n 

+  Z  (a)  Z  (b)  ]  . 
n  n 


We  handle  all  the  terms,  except  the  last,  in  the  following  fashion.  Let  m  =  m(n)  tend 
to  infinity  over  positive  integers  with  n  in  such  a  way  that 

2, 

m  /n  —  0, 


and  define 


Put 


<J  (ck)  = 
n 


cr(ck) 


for  |  k  |  =£  m 
for  |  k  |  s  m+1  . 


n  n 

S  =;  y  y  g(ct+a) g(cs-rt») a  (c(s-t)) 
n  n  /  j  /  i  n 

t=l  S=1 


(2. 18) 


(2. 19) 


where  9  is  fixed  and  for  the  moment  now  shown.  If  we  write  o(  • )  in  place  of  a^(  • )  , 
this  becomes  the  first  quantity  of  interest,  n^X(a)X^(b).  When  we  make  the  index 
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change  from  s  to  k=s-t,  we  obtain 


n 


s=”  y  g(ct+a)f  (t) 
n  n  L-i  n 


with 


t=l 

n-t 


f  (t)  =  y  g(c  (t+k)+b)  o  (ck) . 
n  n 


k=-t+l 

There  are  three  ranges  of  t  to  be  considered  in  accordance  with  Eq.  (2.18).  Since 
n-2m  —  00 ,  we  have  for  all  large  enough  n 
m 


'n«  = 


l 


k=-t+l 

m 

I 

k=-m 

n-t 

l  , 

k=-m  ' 


S  g(c(t+k)  +b)  cr(ck) 


lst<m 


for  < 


m+1  ^  t  ^  n-m 


(2.  20) 


n-m+1  ^  t  £  n  . 


Therefore, 


n-m 


m 


S  =  ~  y  g(ct+a)  y  g(c(t+k)+b)  a(ck) 
n  n  L-i  Lj 


t=m+l  k=-m 

+1 

n 


m  n 


g(ct+a)fn(t) 


t=l  t=n-m+l 


Now  in  both  the  first  and  third  sums  in  Eq.  (2.  20)  there  are  at  most  2m  summands 
possessing  a  uniform  bound.  Hence 
m 


n-m 


S  =  )  o-(ck) 

k=-m  ^  t=m+l 


’n=  Z  a*Ck*  n  S^ct+a)  g(c(t+k)+b) 


+  0(m  /n) 
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We  write  the  sum  in  square  brackets  as 


n  m 


-  ^  ^  ^  g(ct+a)  g(ct+ck+b)  . 

t=l  t=l  t=n-m+l 


Herein  the  second  and  third  sums  are  bounded  in  absolute  value  by  m  times  a  constant 
independent  of  k.  We  therefore  have 


m 


s„  =  £  <**) 

k=-m 


n 


V  g(ct+a)  g(ct+ck+b) 


t=l 


m 


+  0(m/n)  cr(ck)  +  0(m-/n). 
k=-m 


But  according  to  Theorem  1  the  average  within  square  brackets,  using  the  notation  in 

Eq.  (2. 11),  differs  in  absolute  value  from  v  (ck+b-a)  by  no  more  than  K/n  where  the 

9 

constant  can  be  taken  independent  of  k.  From  the  summability  of  ct(  • )  and  the  choice 
of  m,  there  follows 


lim  S  =  /  cr(ck)  y  (ck+b-a)  <  °°  . 
n  n  L-i  9 


(2.21) 


k=-°° 


Finally,  since  there  are  n-  |k|  indices  t,  s  =  1,  2, . . .  ,n  for  which  the  difference  s-t 
equals  k,  we  have  from  Eq.  (2. 12),  Eq.  (2. 18)  and  Eq.  (2. 19) 

n  n 

|n  £Xn(a)Xn(b)  _  SJ  ~  ~  ^  ^  | cr(c(s-t))  -  tfn(c(s-t))| 

t=l  s=l 

=  K'  ^  ^  1-  ^  |a(ck)  -  o-n(ck)| 

|k|^i-l 

=  K'  I  (l"T-)  Hck)!  • 

m+1  ^  |  k  |  ^n-1 


20 


This  goes  to  0  as  n  >  m  —  00  because  the  infinite  summation  is  finite. 

Precisely  the  same  limit  in  Eq.  (2.  21)  is  found  to  hold  for  the  second  term  in 
Eq.  (2. 17).  Therefore,  for  any  a  and  b 


lim  n  fx  (a)X  (b)  =  lim  n  f  Y  (a)  Y  (b) 
n  n  n  n  n  n 


=  2 


cr(ck)  y  (ck+b-a) 
6 


a(ck)  cos  ck0 


cos(b-a)0 


where  the  last  equality  is  a  consequence  of  the  definition  in  Eq.  (2. 11)  of  y  ( •)  and  the 

U 

fact  that  cr(ck)  sinckfl  is  an  odd  function  of  integers  k.  In  a  similar  fashion,  we 
obtain 

lim  n  fx  (a)  Y  (b)  =  lim  n  fx  (b)Y  (a) 
n  n  n  n  n  n 


VAJ 

=  2 


<r(ck)  y  (ck+bfa) 
6 


k=-°° 


Therefore,  taking  limits  in  Eq.  (2. 17), 

OO 

lim/^D  (a)D  (b)  =  2p2|A(0)|2  )  a(ck)cosck0 

n  n  n  L  '  Lj 

k=-°° 

+  lim  n  /  Z  (a)  Z  (b)  . 

L  n  n 


cos  a0  cosb0 


(2.  22) 


With  regard  to  the  remaining  limit  we  have  from  Eqs.  (2. 12)  and  (2. 15) 
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n  n 


nfz  (a)Z  (b)  =  -  )  /flu  u  ,  -a(a)][u  u  ,  -  a(b)] 
u  n  nv  n  Zy  Zy  L  ct  ct+a  v  cs  cs4*>  v 


t=l  s=l 


n  n 


=  —  /  /  [f  u  u  u  u  -<7(a)o-(b)] 

n  Z  Z  c  ct  ct+a  cs  cs+b  '  '  v  ' 


t=l  s=l 


n  n 


=  ^  ^  ^  [  <r(c(s-t))  <r(c(s-t)+b-a)  +  a(c(s-t)+b)  <r(c(s-t)-a) 


t=l  s=l 


+  MNG(a»  c(s-t),  c(s-t)+b)] 


=  ^  |^1  — ^  [  a(ck)  a(ck-Hb-a)  +  a(ck+b)  a(ck-a)  +  /^^(a,  ck,  ck+b)] 


k  sn-l 


Using  the  summability  assumptions,  and  letting  n  go  to  infinity,  the  resulting  limit 
combines  with  Eq.  (2.  22)  to  give  the  asserted  formula.  Q.  E.D. 

Remark.  The  restriction  to  processes-}^}  with  vanishing  third  order  moment 
sequences  was  made  solely  for  the  purpose  of  simplifying  the  final  formula.  When 
the  assumption  fails,  the  result  will  have  two  additional  terms  (cf.  Eq.  (2. 16)) 
involving  the  third  order  moment  sequence  of  two  indices. 

Remark.  The  non-Gaussian  contribution  to  the  limiting  value  of  £  Dn(a)Dn(b)  takes 
on  a  particularly  simple  form  when  {u_}  is  a  one-sided  (for  physical  realizability) 
linear  process  in  independent  random  variables  {w^J-  ,  i.e. 

Ut  =  /?0Wt  +  /3lWt-l  +/32Wt-2+”'  (^l£jl<0°>-  (2*23> 

j&0 

If  we  express  the  input  process  moments  in  terms  of  cumulants 
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t 

c 

t 

t 


W 

t 

2 


w 


4 

w 

t 


=  0 


2 

a 


=  k. 


=  K,  +  3(7  , 
4 


then  we  have 

a(k)=  *2^jV 
j 

£UtUt+kl\+k2  =  K3^j^j+k1Vk2  (2.24) 

j 

^NG^l’W  =  K4  ^  ^jVk^j+k/j+kg 

j 

where  ^  =  0  for  all  j  <  0.  From  these  we  find  the  representation 

00  K 

^  MNG(a>  ck>  ck+b)  =  ~~J~  CT(a)  o'O3)  (2-  25) 

i  O' 

k=-00 

4  2 

independent  of  c.  K^/cr  is  the  familiar  "coefficient  of  excess"  over  a  N(0,  a  ) 

distribution  of  {w^},  and  it  vanishes  for  such  a  distribution. 

In  summary,  we  specialize  our  preceding  results  to  linear  Gaussian  process,  since 
in  such  cases  it  follows  almost  immediately  that  the  random  variables  D^fli),  corres¬ 
ponding  to  distinct  choices  of  the  lag  variable,  tend  to  normality. 

Theorem  4.  Let  {u^.}  be  a  one-sided  linear  process  in  independently  identically 
distributed  0  mean  normal  random  variables,  and  suppose  its  covariance  sequence 

£vt+|k| 
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is  absolutely  summable.  Let  £  =  {a^,  o;^,  a^, . . . }  be  any  linear  operator  with 
absolutely  summable  impulse  response  coefficients,  and  let 

OO 

A(o>)  = 

j=0 

be  its  complex  valued  transfer  function.  Introduce  the  time  series 


=  £[  p  cos(0t  -  cp)\  +  u^  (t  =  0,  ±1,±2, . . .  ) 


where  0  <  0  <  tt  and  p,  cp  are  arbitrary.  Fix  an  integer  c,  and  define  the  random 
variables 


n 


]n(h)  =  n  X 


z  z  ,  , 
ct  ct+h 


t=i 


and  sure  quantities 


Y0(h)  =  5p2|A(0)|2cosh0 


^0  (a’  b>  =f  2P 


ou 

|AOT|2  £ 


cr(ck)  co  s  ck0  ]  co  s  a0  co  s  b0 


k=-°° 

OO 

+  'y  [  <r(ck)  cr(ck-t-b-a)  +  <r(ck+b)  a(ck-a)  ] 
k=-<» 


where  h,  a  and  b  are  integers.  Finally,  put 


Dn(h)  =  ^[Cn(h)-Y0(h)-cr(h)]  . 


Then 

(i)  C  (h)  —  yn(h)  +  <r(h)  as  n-  00  both  in  mean  square  and  with  probability 
n  6 

one. 

(ii)  If  h  =  (h. ,  h„, . . . ,  h  )  is  any  finite  collection  of  distinct  integers, 

"■  J.  z  r 
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then  the  distribution  of  the  vector  of  random  variables 


D  (h  ),D  (h  ),..., D  (h  ) 
n  i  n  z  nr 


tends  to  that  of  a  centered  r-variate  normal  distribution  as  n  -  The 

entry  of  the  covariance  matrix  of  the  limiting  distribution  corresponding 

to  lx,  h  e  h  is  equal  to  the  limiting  value  of  the  corresponding  covariance, 

and  their  common  value  is  ^  (h. ,  h.) .  (i,  j  =  1,  2,  . . . ,  r). 

v  1  j 

Proof.  From  Eq.  (2.  12),  *Jn  X  (h)  and  \fn  Y  (h)  are  normal  for  every  n.  It  is 

n  n 

a  known  result  in  time  series  analysis  (Parzen,  Ref.  [  8] )  that 


n/5  Z  (h.),  ^  z  (h  ), . . . ,  -sft  z  (h  ) 

n  l  n  2  nr 

tend  to  joint  normality  as  n  —  00  (for  linear  processes  {u^}  but  not  necessarily  normal). 

Moreover,  the  covariances  of  the  limiting  distribution  are  given  by  the  limiting  values 

of  nfz  (h.)  Z  (h.).  It  remains,  therefore,  to  multiply  Eq.  (2. 13)  through  by  \Tn  and 
v"  n  1  n  j 

evaluate  second  order  moments.  (The  bias  is  0(1/  sJn  )  so  mean  square  and  variance 

are  asymptotically  the  same.)  Theorem  3  gives  the  result  of  these  calculations, 

where  /u  is  0  by  hypothesis.  Q.  E.D. 

NG 

Remark.  The  function 

~  -  ■■  "  ~  OO 

f  <x(ck)  cos  cka>  ,  (2.26) 

C  27T  Lj 

k=-oo 

whose  value  at  u>=0  appears  in  the  limiting  covariance  ip (a,b),  becomes  the  spectral 

9 

density  function  of  {u  }  for  c=l.  (It  is  an  interesting  problem  to  try  to  express  f  ( •) 

t  c 

in  terms  of  c  and  f^(  • )  . )  From  Eq.  (2.  23)  it  follows  that 

f(o>)  =  =  I  ^.eIj"  I2  .  (2.27) 

j  -0 
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(2.  28) 


Thus,  incase  uf  =  Jlw^,  i.  e.  /T  =  ay  we  will  have 

2 

£(“)'  -57  lA<")l 

for  all  co.  The  second  summation  in  ip  ( a,  b),  for  c=l,  can  be  written  in  terms  of  the 

0 

spectral  density  (see  (4. 32)  below). 
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3.  METHOD  I 


We  are  now  going  to  apply  the  results  of  the  previous  section  to  the  following 
statistical  problem:  given  observations 


zt  =  £[pcos(0t-<p)]  H-u.  (t=tQ,t0+l, . . .)  (3.1) 

estimate  9  e(0,  n)  without  knowledge  of  (p  or  p  f  0.  In  accordance  with  our  basic 
ground  rule,  we  want  to  consider  methods  involving  finite-memory  computations 
which  can  be  performed  rapidly  (the  latter,  of  course,  being  a  relative  requirement). 
The  model  in  Eq.  (3. 1)  can  arise  in  different  ways,  but  more  often  than  not  it  is  the 
output  of  a  linear  filter  whose  input  is  disturbed  by  additive  noise.  That  is,  u^  =  £vf 
for  some  process  {v  }.  To  be  concrete,  we  introduce  two  such  situations  which 
subsequently  are  used  to  illustrate  our  results. 

Problem  1.  Estimate  9  from  data 


y  =  (polynomial  in  t  of  degree  at  most  p-1) 

*  (3.2) 

+  pcos(0t  -  cp)  +  w  (t  =  l,2, ...) 

r  i  2 

where  {w^}  is  a  white  zero  mean  process  with  variance  a  ,  and  the  unknown  polynomial 

trend  is  not  of  interest.  The  latter  can  be  removed,  as  already  remarked  after  the 

th  d 

proof  of  Theorem  1,  by  applying  the  p  order  difference  operator  Jo  =  A^  to  Eq.  (3.  2). 
Thus, 

aP  aP 

z  =  A  y„  u  =  A  w„ 

t  t  t  t 


with  tg  =  p+1  when  we  take  A  to  be  the  first  backward  difference.  The  covariance 

between  u  and  u  for  a  general  Jo  with  memory  of  order  p  applied  to  white  noise  is  , 
t  t+k 

for  k^O, 


<r(k) 


P  P  p-k 

=  /  /  a, a.  fw  .w  ,  .  =  o'  )  ot,a.  , 

Lj  Lj  i  J  c  t-i  t+k-j  j  ]+k 


i=0  j=0 


j=0 
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Upon  substituting  Eq.  (2.  9),  this  reduces  to 


a(k)  = 


0 


k  =  0,±l, . . . ,±p 


|k|  >P 


(3.3) 


(see  Feller,  p.62,  Ref.  [3]). 

Problem  2.  Estimate  9  from  data 


yt  =  pcos(0t  -  (p)  +  v  (t=l,2, ...) 


(3.4) 


where  {v  }  is  an  autogressive  scheme  of  order  p  driven  by  white  noise  with  variance 

2  1 
c r  ,  i.  e. 

v=a,v  ,  +  •••  +  a  v  +w. 
t  1  t-1  p  t-p  t 

If  the  coefficients  are  known,  we  can  apply  the  operator  £  defined  by 


O'  =  1  a.  =  -a. 
0  J  J 


(j— l,2,...,p). 


We  then  have  Eq.  (3. 1)  with 

2 
<j 

<j(k)  =  < 


k  =  0 


(3.5) 


0  k  f  0 


because  u^  =  £v^  =  w^.  On  the  other  hand,  if  the  coefficients  are  not  known  we  perform 

no  filtering,  and  impose  conditions  on  the  a.’s  so  that  u  =  v  obeys  the  moment 

J  t  t 

restrictions  imposed  by  the  theorems  of  Sec.  2.  In  this  case,  we  would  have 


o(k)/a(0)  =b,  X^  +  -  ••  +b  Ak 

11  p  p 


(3.6) 
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with  A^, . . . ,  Ap  (the  roots  of  the  characteristic  equation)  assumed  less  than  1  in 

modulus.  From  the  point  of  view  of  estimating  9,  it  is  not  clear  whether  Eq.(3.4) 

should  in  every  case  be  "prewhitened,  "  even  when  the  a.'s  are  known.  As  a  by-product 

of  our  calculations  we  will  be  able  to  investigate  this  question. 

Both  in  this  section  and  the  next  we  consider  estimator  sequences  {£  },  where 

n 

n  will  always  be  linearly  related  to  the  number  of  observations  in  Eq.  (3. 1),  of  the 
transformed  parameter  value 

£  =  cos  9  (-1  <  |  <  1). 

In  practice,  one  takes  the  arccosine  of  these  because  it  is  the  angular  frequency  9 

which  is  of  interest.  This  operation  entails  no  difficulty  from  the  standpoint  of  large 

sample  distributions.  Indeed,  suppose  the  distribution  of  the  random  variable  'Jn  (£  -£) 

is  known  to  converge  as  n  —  00  to  that  of  a  normal  random  variable  with  mean  0  and 
2 

variance  V  (£).  We  express  this  property  in  symbols  by 

^(«n"S)  ~N(0,V2(O).  (3.7) 

Let  g(-)  be  a  given  function  whose  derivative  g'(-)  is  continuous  and  nonzero  at 
the  true  parameter  point  £ .  Then  (the  "delta"  method) 

^(g«n)  -g(«))  ~  N(0,g'2(£)V2(£)). 

Suppose,  furthermore,  that  £  £  with  probability  one.  Since  the  true  parameter 

point  by  hypothesis  is  an  interior  point  of  (-1, 1)  it  follows,  with  probability  arbitrarily 
close  to  1,  that 

9  =  cos  1£  ,  (3. 8) 

n  n 

becomes  and  remains  defined  beyond  some  index  n  (which  by  Egorov's  Theorem  is 
independent  of  the  points  in  the  underlying  sample  space).  Taking  g(x)  =  cos  *x,  we 
get 
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(3.9) 


\£(0  -  9)  ~  N(0,  Q2(0))  Q2(0)  =  V  (C°S  g) 

n  .  ^ . 

sin  0 

This  function,  for  the  estimates  we  will  be  considering,  blows  up  as  9  approaches  the 
endpoints.  This  is  not  a  desirable  feature,  nor  is  it  an  unexpected  one.  It  is  difficult 
to  see  how  one  can  avoid  the  situation  without  resorting  to  more  computationally 
expensive  schemes,  such  as  those  based  on  the  Fourier  transform  of  the  appropriately 
weighted  covariance  sequence.  Section  5  contains  a  brief  discussion  of  one  of  these 
other  approaches  to  the  problem. 

*  *  *  *  *  *  * 


Our  first  method  assumes  that  the  lag  0  and  1  covariances  of  {u^}  are  known. 
Under  the  assumptions,  and  in  the  notation,  of  Theorem  4  the  ratio 


Cn(l)  -  <r(l) 

n  =  Cn(0)  -  W) 


(3. 10) 


is  a  (strongly)  consistent  estimate  of  |  =  y  (l)/y  (0)  no  matter  what  value  we  fix  for 

the  integer  c  when  computing  C^(0)  and  C^(l).  (Nothing  is  changed  asymptotically 

of  course,  when  we  start  the  summations  at  t  =  t^  rather  than  t=  1.  Furthermore,  we 

need  only  consider  c^l.)  According  to  Theorem  2,  consistency  holds  under  weaker 

noise  conditions.  However,  we  will  assume  the  restrictions  imposed  on  { u^}  by 

Theorem  4  are  satisfied,  in  the  interest  of  uniformity.  After  we  derive  the  variance 
2 

function  V  for  Eq.  (3. 10),  we  will  consider  the  problem  of  choosing  c  when  the  z's 
are  generated  as  in  the  illustrative  problems. 

Let  us  discard  some  excess  notational  baggage,  and  for  the  purposes  of  algebraic 
manipulation  write 
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>  defined  in  Theorem  4. 


(3.11) 


C,  in 
h 

place  of 

cn<h> 

°h 

tt 

Dn<h) 

Yh 

ti 

Vh> 

^ab 

tt 

4>e(  a>  t>) 

The  first  two  are  random  variables  depending  on  the  sample  size  or,  what  is  the 
essential  thing,  the  highest  data  index  , 

N  =  cn  +  h  ,  (3. 12) 


on  the  z' s.  The  second  pair  in  Eq.  (3. 11)  depends  on  the  parameter  0  under  estimation. 
Keeping  this  in  mind,  Eq.  (3. 10)  is  seen  to  be  equivalent  to 


yn  D ,  -  y,  Dn 


Dr«Do 

C0  -0(0) 


because  y^  is  £  y^  .  The  numerator  tends  to  normality  and  the  denominator  converges 
in  probability  to  y^.  By  a  well-known  theorem  in  large  sample  distribution  theory 
(Cramer,  Sec.  20.6,  Ref.  [2])Eq.  (3. 7)  follows  with 


v  «>-— #00] 


(3. 13) 


under  the  assumption  that  y^  is  strictly  positive.  Now  we  can  write,  introducing  the 
spectral  notation  in  Eq.  (2.  26)  and  indication  of  the  dependence  on  our  selection  of  c. 
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2  |  >2  i 

^ab  =  [47rp  |A(0)|  f^(0)]  cosatfcosb#  +  (p 


,(c) 

ab 


(3.14) 


Upon  substituting  into  Eq.  (3. 13)  we  find  the  first  terms  in  ip^  depend  on  £  in  such  a 
way  that  they  do  not  contribute.  Therefore 


(3. 15) 


(q) 

We  note  that  cp  must  always  be  positive,  because  it  is  the  limiting  value  of 
2  aa 

n/  Z  fa).  y_  >  0  places  restriction  on  the  filter  coefficients  &  <x  a  or  0 . 

^  n'  '0  r  0  1  p 

By  Eq.  (3.9),  now,  *Jn(d  -d)  has  a  large  sample  normal  distribution  with  mean  0 
and  variance 


n 


Q2(0) 


a2(Q 

2 

T0 


(3.16) 


when  we  set 


a2(S)  = 


(c)t2  9  „<C)t+JC) 

’’oo{  ‘  2lf’od  +  ^11 

i-l2 


<  U I  <  i) 


(3. 17) 


If  the  entire  covariance  sequence  cr(  • )  of  {u  }  is  given,  we  can  set  up  large  sample 
confidence  intervals  on  0,  which  are  free  of  unknowns,  by  consistently  estimating  the 
parameter  values  and  A.  (Weak  consistency  suffices,  although  our  estimates  are 
strong.)  Cq-  <t(0)  is  just  such  an  estimate  of  the  former.  The  numerator  in  Eq.  (3.  17) 
is  positive  (for,  as  a  matter  of  fact,  all  real  numbers  £).  When  <1,  therefore, 
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we  can  certainly  compute  the  real  positive  square  root  A(£  ).  Since  A(  •)  is  a  continuous 

n 

function  at  every  interior  point  of  (-1,  1),  it  follows  from  £  —  £  that  M!n)  M£)  with 
probability  one.  Therefore,  if  we  take  such  that 


the  interval 

ke 

Cn(0)  -o<0) 


(3. 18) 


will  have  confidence  coefficient  1-e. 

****** 


The  computations  involved  in  Method  I,  i.e. ,  Eq.  (3.  10),  are  relatively  trivial. 
Indeed,  the  basic  statistics  are  0^(0)  and  Cn(l)  which,  being  averages,  can  be 
generated  recursively.  Only  the  previous  z(obtained  c  time  units  ago)  need  be  re¬ 
tained.  The  inversion  of  the  cosine  at  each  step  to  get  the  estimate  of  frequency  is 
an  inherent  part  of  our  approach,  and  presents  no  real  computational  difficulty.  The 
price  we  pay  for  computational  rapidity  is  reflected  in  the  variance  of  the  limiting 
distribution.  A  quantitative  examination  of  this  function  is  made  below  for  the 
illustrative  problems. 

With  regard  to  determining  a  choice  for  c,  the  expression  to  be  investigated  in 

2 

any  given  situation  is  not  V  alone,  but  rather  the  large  sample  variance  of  £ 
written  in  terms  of  the  total  number  of  z -observations  used  to  achieve  it.  By  Eq.  (3. 12), 
this  is 

V2/n  =  cV2/N 


in  large  samples.  It  is  c  times  the  quadratic  in  Eq.  (3. 15),  therefore,  i.e. 
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(3.19) 


Yc)  =  ct  "  2%h  +  1 


which  we  would  like  to  minimize  in  some  sense  with  respect  to  c  ^  1.  Up  to  the 
2 

multiplier  l/y^  ,  this  is  the  loss  incurred  when  we  use  c  and  £  is  the  true  state  of 
nature.  The  coefficients  are  given  in  Eq.  (3. 14)  as  a  function  of  c  corresponding 
to  the  particular  covariances  <j(-)of{ut}.  We  now  consider  this  aspect,  among 
others,  for  the  two  illustrative  problems. 

Application  to  Problem  1.  The  coefficients  of  interest  are 


”oo  ■  2  I  "2«*> 


k=-°o 


=  2  ^  o-(ck)  a(ck+l) 
k=-°° 


(3.20) 


<P 


(c)  = 
11 


I 


[a  (ck)  +  n(ck+l) a(ck-l) ] 


k=-oo 


with  ct(  •)  given  by  Eq.  (3.3).  From  the  same  combinatorial  summation  formula  used 

to  obtain  Eq.  (3.3),  we  find  the  general  expression 

4p  4p 

2p-b+a  +  (  2p-b-a 


(1)  4  b-a 

”ab  =  °  ('1) 


(3.21) 


n 


This  is  valid  for  all  integers  b  s  a  s  0,  with  the  usual  convention  that  ^  J=  0  for 
r  <  0  or  r  >  n.  Let  us  consider  Eq.  (3.  20)  with  c=p+-2.  Since  a(  • )  vanishes  for  all 
arguments  exceeding  p  in  absolute  value,  only  the  k=0  terms  remain  and 
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2 


<P, 


<P 


<P 


£"2>  -  ** 

:  /2P 

00 

VP 

<<*2) 

■  /2p 

01 

\  P 

b 

II 

I 

/  2p 

ii 

V  p 

p-1 


2  -  2p  ^2 

+ 


p-1 


The  same  expressions  obviously  hold  for  any  larger  setting  of  c.  The  calculation  of 
closed  expressions  for  the  three  sums  in  Eq.  (3. 20)  over  2  ^  c  ^  p+1  we  leave  to  the 
reader,  and  content  ourselves  with  a  comparison  of  Eq.  (3. 19)  at  the  extreme  values. 
We  find 

L  (1)  <  L  (ph2) 

for  every  |£  |  <  1  and  p  ^  1.  In  fact,  as  p  -*■  00 , 


L^(p+2) 

L|(l) 


2p 

P 


4p 

2p 


n/d 


(see  Feller,  p.  63,  Ref.  [3]).  We  will  consider  this  sufficient  reason  for  setting  c=l. 

The  variance  of  the  limiting  normal  distribution  of  *Jn(0  -6)  for  Problem  1  using 
Method  I  with  c=l  is 


Q>) 


4a 


1  /  4p 

,4p  \2V 


2 

2  cos  6  + 


A  cos  6  +4E+ML 

2p+1_ 2p2  +  3p+l 


4o  2 
sin  F(|0)sin  6 


(3.22) 
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where  we  have  added  indication  of  the  dependence  on  the  differencing  order.  This  is 

obtained  by  using  Eqs.  (2. 10)  and  (3.  21)  in  the  general  formula  of  Eq.  (3.  15),  factoring, 

2 

and  dividing  by  sin  9  .  The  constant  multiplier  in  Eq.  (3.  22)  is  the  inverse  of  the 

2  2  2 

square  of  the  "  signal  to  noise"  ratio  |p  /a  .  The  value  of  Q  becomes  extremely 
large  with  increasing  differencing  order  over  the  lower  end  of  the  parameter  range, 
as  is  clear  from  Table  1.  For  p=0,  which  means  there  is  no  polynomial  trend  in 
Eq.  (3.2),  the  variance  function  is  proportional  to 

2  2 
2ctn  9  +  esc  9  . 

This  is  symmetric  about  7r/2,  and  takes  on  its  minimum  value  of  1  there  (and  this  was 

our  reason  for  the  placement  of  the  numerical  constant).  As  9  •—  0  the  function  increases 
2 

to  infinity  like  1/9  .  In  space  applications,  for  example,  the  trend  is  usually  taken  to 
be  parabolic,  i.e.  p=3,  so  the  curves  corresponding  to  large  p  are  mainly  of  academic 
interest.  On  the  other  hand,  at  the  upper  end  of  the  range  the  variance  initially  de¬ 
creases  with  increasing  p,  and  then,  of  course,  increases.  Thus,  if  certain  prior 
knowledge  exists  concerning  9,  it  may  pay  to  difference  the  data  considerably  more 
times  than  the  degree  of  the  unwanted  polynomial.  For  example,  if  it  is  known  that 
0  >  •  Sir  we  should  difference  the  data  5  times  no  matter  what  the  trend,  of  degree 
less  than  4.  Figure  2  shows  this  behavior  over  the  upper  half  of  the  interval.  The 
reason  for  this  is  that  differencing  tends  to  pull  large  frequencies  towards  the  center 
of  the  interval,  where  estimation  is  easier. 
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r 


p 


0/tt s 

s  0 

1 

2 

3 

4  5 

6 

7 

8 

9 

.1 

2.  942(1) 

3.614(4) 

4.992(7) 

7.  265(10) 

1.089(14)  1.665(17) 

2.  578(20) 

4.  032(23) 

6.355(26) 

1.008(30) 

.2 

6.683 

5.  515(2) 

5.043(4) 

4.  838  (6) 

4.  775  (8)  4.  799(10) 

4.  886(12) 

5.023(14) 

5.202(16) 

5.420(18) 

.3 

2.584 

4.  620(1) 

9.915(2) 

1.907  (4) 

4.057  (5)  8.777  (6) 

1.922  (8) 

4.247  (9) 

9.453(10) 

2.  116(12) 

.4 

1.317 

7.578 

5.474(1) 

4.086  (2) 

3.114  (3)  2.409  (4) 

1.  884  (5) 

1.485  (6) 

1.178  (7) 

9.  401  (7) 

.5 

1.000 

1.750 

6.125 

2.217  (1) 

8.155  (1)  3.034  (2) 

1.139  (3) 

4.  305  (3) 

1.637  (4) 

6.  250  (4) 

.6 

1.317 

5. 164(-1)  || 

9.  919(-1) 

2.117 

4.606  1.012  (1) 

2.235  (1) 

4.  966  (1) 

1.108  (2) 

2.  480  (2) 

.7 

2.584 

2. 683(-l) 

2. 209(- 1)  II 

2. 911(-1) 

4. 246(-l)  6. 372(-l) 

9.  664(-l) 

1.473 

2.254 

3.457 

.8 

6.  683 

4. 223(-l) 

1. 421(-1) 

8. 461(-2) 

6.  944(-2)  6.  818(-2)|| 

7.  335(-2) 

8.  290(-2) 

9.  610(-2) 

1. 130(-1) 

.9 

2.  942(1) 

1.814 

5. 237(-l) 

2. 302(-l) 

1.  246(-l)  7.670(-2) 

5. 158(-2) 

3.  706(-2) 

2.  807(-2) 

2.  220(-2)|| 

Table  1. 


'  -  ~7~  Q^(0)  given  by  Eq.  (3.22). 
4(7  p 


The  number  within  parentheses  is  the  exponent  of 
the  power  of  10  by  which  the  entry  is  to  be 
multiplied. 


05  0.6  0.7  0.8  09  1 


B/lr 


Fig.  2.  --  —  Q^(0)  given  by  Eq.  (3.22). 

4a  P 
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Application  to  Problem  2.  Let  us  consider  the  simplest  nontrivial  situation  in 
which  the  driving  sequence  in  Eq.  (3.  4)  obeys 

v=av  ,+w  ( I  a  I  <  1) . 

t  t-1  t  mi/ 

If  a  is  known,  we  can  take  £  =  {l,  -a}.  Substituting  Eq.  (3.  5)  into  Eq.  (3.  20)  we  get 


(°)  _ 
01 


4 

CT 


independent  of  c.  Since  we  want  to  keep  the  loss  in  Eq.  (3. 19)  small,  we  should 
therefore  take  c=l  when  computing  the  estimate  in  Eq.  (3. 10).  The  squared  modulus 
of  the  transfer  function  of  £  is 


I  A(0)  |2  = 


p  1 

2 

r  P  -i 

/  Of .  COS  jfl 

+ 

^  Of  sinj0 

Lj  j 

j=o  J 

-j=o 

=  (l-acos0)  +  a  sin  6 
2 

=  l-2£a  +  a 


in  terms  of  £  =cos0.  The  function  in  Eq.  (3. 15)  is  therefore 

„2  _  4cr4  2£2  +  1 

V  4  ’  _  2.2 

p  (l-2|a  +  a  ) 


(3.23) 


Now  suppose  we  perform  no  filtering  (as,  for  example,  when  a  is  not  known). 
The  process  u^  =  v  (assumed  to  have  been  arbitrarily  initialized  at  t  =  -00)  has  the 
covariance  sequence 

a(k)  =  a(0)a'kl  o(0)  =  (k  =  0,±l,±2, . . .). 

1-a 

The  sums  in  Eq.  (3.  20)  are  found  to  be 
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wherein 


W-  V  „2c|k| 


=  E a 

k=-°° 


(c)  „  2,m  V  c|k|  |ck+l|  0  2mx  a  -Hd 

•^oi  = 2or  <0)  Z a  a  -2<T(0)i(rbr 


k=-oo 


<P 


(c) 

11 


=  ff2(0)  ^  i 


2c|k|  lek+il  lek-lj 


k=-°° 


=  <r2(0) 


2tI~  +  a2_1 

l-D 


b  = 


2c 

a 


belongs  to  (0,  1)  no  matter  what  our  choice  of  csl,  These  combine  to  give  for 
Eq.(3.15),  since  |A(0)|=1, 


y2  =  i£_ 


(1 


2  2 

•a  )  (1-b) 


2 (I-Hj)I2  -  4-^-^- 


|  + l+3b+a  (1-b) 


(3.24) 


where  the  bar  indicates  no  prewhitening.  The  loss  in  using  c,  when  £  and  a  are 
the  true  parameter  values,  is 


(c)  =  c  V2  . 


We  can  adopt  as  a  criterion  for  choosing  c  minimizing  the  worst  that  can  happen  to 
us.  The  result  of  maximizing  over  ||  |  ^  1  is  proportional  to 


3  +  4  |a|+ a^  +  4  |a|^C  *  +  5a2c 


c 


1 


2c 

-a 


a2(c+D 
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If  a  is  known,  we  then  choose  the  integer  c=c(a)  for  which  this  ratio  is  smallest. 
One  might  believe  that 


(c) 


1  •  V 
c-  V 


would  not  exceed  1  no  matter  what  value  we  fix  for  c,  because  the  numerator  is  the 
variance  of  the  estimate  which  makes  use  of  the  assumed  knowledge  of  the  value  of  the 
nuisance  parameter  a.  This  is  erroneous,  since  for  any  fixed  integer  c  ^  1  there 
exists  a  £  and  a  (in  fact,  a  continuum  of  values)  for  which  E  >  1.  Indeed,  one  need 
only  take  £=a.  Then  the  square  bracketed  quantity  in  Eq.  (3.  24)  is  just  (1-a2)  (1-b). 
This  combines  with  Eq.  (3.  23)  to  yield 


E 

a, 


2a2  +  1 

c(l-a2) 


Given  the  value  of  c,  it  remains  to  choose  a  sufficiently  close  to  1  so  that  E  >  1. 
The  continuum  of  values  is  obtained  by  letting  £  range  through  a  sufficiently  small 
neighborhood  of  a,  and  appealing  to  continuity.  It  is  important  to  remember  that 
this  comparison  is  based  solely  on  the  particular  estimate  in  Eq.  (3. 10). 
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4.  METHOD  II 


Our  second  procedure  for  estimating  the  parameter  value  6  in  Eq.  (3. 1)  requires 

a  bit  more  computation  than  does  the  estimate  based  on  Eq.  (3. 10).  The  new  £ -estimate 

is  an  unambiguous  solution  of  a  certain  quadratic  equation  whose  coefficients  involve 

C  (2)  as  well  as  C  (0)  and  C  (1).  The  dependence  on  population  quantities  is  via  the 
n  n  n 

correlations  a(l)/cr(0)  and  ct(2)/ct(0).  Thus,  when  {u^}  is  a  prescribed  linear  opera¬ 
tion  on  white  noise  (as  e.g.  in  Problem  1),  there  is  no  need  to  know  the  common 
variance.  Furthermore,  the  coefficients  have  the  property  of  being  invariant  under 
the  replacement  of  C^(h)  by  C^(h)  -  a(h).  This  feature  greatly  facilitates  calculation 
of  the  limiting  distribution  of  the  solution. 

The  observation  which  suggests  the  procedure  is  that  the  Tchebichev  polynomials 
in  Eq.  (2. 1)  both  satisfy,  for  the  appropriate  pair  of  initial  conditions, the  same  second 
order  difference  equation;  viz. , 


f  , ,  -  2g  f  +  f  ,  =  0 
t+1  s  t  t-1 


Since  £  is  a  linear  operator,  the  sequence  of  regression  functions  in  Eq.  (2.3)  also 
obeys  this  recursion: 


g0(t+2)  -  2|g0(t+l)  +gg(t)  =  0  . 


(4.1) 


It  thus  follows  for  Eq.  (3. 1),  no  matter  what  our  choice  of  the  integer  c  ^  1,  that 


z 


(4.2) 


wherein 


(4.3) 


The  former  is  an  identity  in  £  =cos  6  relating  three  successive  observables  to 

hypothetical  random  variables.  The  covariances  of  {u*}  in  terms  of  those  of  {u  }  are 

*■  t 
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(4.4) 


<j^  (k)  =  4cr(ck)4  -  4[  a(ck+l)  +  a(ck- 1)]  4 

+  2a(ck)  +  a(ck+2)  +  a(ck-2)  (k  =  0,  ±1,  ±2,  . . . ). 


Let  2,  now,  denote  the  n*n  matrix  with  entry  o^(t-s)  in  row  s  and  column  t,  and 

let  crft  be  the  st^  entry  of  2  \  which  we  presume  exists  for  all  |4  |  <  1.  If  u*  is 
s  _x 

the  column  vector  with  components  uf,u*,  . . . ,  u*,  then  the  components  of  u**  =  2  2u* 

1  z  n  —  —  — 

are  uncorrelated.  If  we  apply  the  principle  of  Least  Squares  to  the  linear  combinations 

2  2 

of  Eq.  (4.  2),  an  estimate  of  4  is  obtained  by  minimizing  u**  +  •  •  •  +  u**  ,  i.  e. 
finding  a  root  of  the  equation 


0  =  • 


94 


n  n 

S=1  S=1 


St(z  -  24  z  ,  +  z  )(z  -24z  +z  ),  (4.5) 

4  cs+2  cs+1  cs/v  ct+2  ct+1  cr 


where  for  notational  convenience  we  are  assuming  tQ=l  in  Eq.  (3. 1).  Generally,  this 
is  highly  nonlinear  and,  in  addition,  the  inverse  matrix  changes  in  a  nontrivial  fashion 
with  sample  size. 

Nonetheless,  the  above  considerations  together  with  our  freedom  to  choose  c  as 
we  please,  suggest  a  computationally  simple  method  for  estimating  4-  First  of  all, 
the  variance  of  {u*},  which  does  not  depend  on  c,  is 


a  (0)  =  2o(0)  P  (4) 


p2(4)  =  242  -4Pi4+  i+p2 


(4.6) 


where 


Pl  =  a<l)/a(0) 


P2  =  a(2)/a(0) 


are  the  first  two  correlations  of  {u^} .  We  will  assume 


£  f  P 


i* 


i(i+P2) 


(4.7) 
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so  that  P  (|)  >  0.  This  may  or  may  not  place  restriction  on  the  unknown  parameter  in 
(-1,1).  For  example,  if  { u^}  is  any  stable  autogressive  scheme  the  discriminant  is 
always  negative,  and  Eq.  (4.  7)  amounts  to  no  restriction  at  all.  Equivalently,  we 

r 2  V 

could  assume  that  p^  ±  J p  does  not  belong  to  (-1,  1).  (Note  that  if  £ 

coincides  with  one  of  these  points,  then  u*=  0  with  probability  one  and  Eq.  (4.  2)  can 
be  solved  exactly.)  In  any  event,  let  us  imagine,  for  the  moment,  that  {u^}  is  a  moving 
average  of  some  given  order  0  ^  q  <  °°  ,  i.  e.  all  coefficients  in  Eq.  (2.  23)  are  0 
except  •  •  •  »/3  •  Then  cr(k)  =  0  for  all  |k|  s  q+1.  For  the  selection 

c  =  q+3 


we  see  that  cr  (k)  in  Eq.  (4.  4)  is  0  for  all  k  /  0  and  all  £ .  The  "errors”  in  Eq.  (4.  2) 
**  St 

are  thereby  made  white.  Since  o’  is  0  unless  s=t,  Eq.  (4.5)  reduces  to 


P2(0 


/  (z  -  2£z  +  z  )2  . 

/  /  ct+2  ct+1  ct 


t=l 


(4.8) 


We  are  going  to  use  a  solution  of  Eq.  (4.  8)  to  estimate  £  in  any  case,  i.  e.  even  when 

o-(  •)  does  not  vanish  above  some  index.  Furthermore,  we  can  and  will  leave  free  the 

2 

choice  of  c  until  we  have  calculated  the  variance  function  V  . 

Before  proceeding,  it  should  be  pointed  out  that  the  technique  known  in  numerical 
analysis  as  Prony's  method  (Hildebrand,  p.382,  Ref.  [5]),  for  approximating  the 
angular  frequency  8  from  data 


z^  =  pcos(0t-<p)  +  u^  (t=  1,  2,  . . . ,  n) 

2 

when  the  errors  are  independent  and  a  is  "small",  is  equivalent  to  solving  Eq.  (4.  8) 

2 

with  c=l  and  the  factor  1/P  (£)  deleted  (and  then  taking  the  arccosine).  This  "least 
squares"  estimate  (cf.  Eq.  (4.  2)  with  c=l)  is 
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n 


n 


r  = 


z  „  z  ,  +  >  z  ,  z 

t+2  t+1  Z,  t+1  t 

t=l  t=l 


2Zz 

t=l 


t+1 


and  when  n  is  large  it  will  statistically  behave  like  the  ratio  2C^/2Cg.  Thus,  with 
probability  one  (see  Eq.  (4. 11)  below) 


I* 


y0^  + 

r0  +  o-(O) 


2 

+  a 


«  . 


so  that  £;*  always  underestimates  £.  For  large  values  of  the  signal  to  noise  ratio 
2  2 

R  =  gp  /a  ,  the  right  side  will  be 


On  the  other  hand,  the  relative  bias  error  can  be  quite  large  when  R  is  small. 

Prony's  method,  of  course,  presumes  the  contrary.  Nonetheless,  if  the  error  variance 

is  known  (as  is  usually  the  case  in  numerical  approximations)  the  estimate  £*,  simply 

2 

modified  by  subtracting  cr  out  of  the  denominator,  converges  to  the  correct  value 

(i.  e.  Method  I).  As  we  will  see  below  in  Eq.  (4.  25),  we  get  a  consistent  ij  -estimate 
2 

without  knowing  cr  ,  or  making  any  assumption  on  R,  by  using  and  C 2  rather  than 

and  Cq. 

Let  us  return  to  Eq.  (4.  8),  and  simplify  the  notation  by  writing 


a 

t 


Zct+2 


b  = 
t 


Jct+1 


c  = 
t 


ct 


After  performing  the  differentiation,  using  Eq.  (4.  6)  and  multiplying  through  by 
4 

■P  (£)/4  t  0,  we  find  that  Eq.  (4.  8)  is  equivalent  to 
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0  =  -2(2b2)£P2(£)  +  (S(a+c)b)P2(S) 

+  tt-Pj)  [  4(2b2)|2  -  4(2(a+c)b)|  +  2(a-k:)2] 

2  n  2  2  2 
wherein  2b  is  2b,  etc.  The  leading  term  in  P  (|)  is  2|  ,  so  happily  the  cubic 
t=l  1 

term  cancels.  If  we  collect  the  coefficients  and  multiply  through  by  -l/2n,  then 

1  2pi  2  2 

0  =  [  -  2(a+c)b - 2b  ]  x 

n  n 


„  l+p„  9 

.[J_  2(a+c)  -  2b  ]x 

P1  2  ^"*"P2 
+  [  —  2(a+c)  -  -gjj—  2(a+c)b  ] 


(4.9) 


where  we  have  written  x  in  place  of  £  (the  true  parameter  value).  Now  we  have, 
using  the  abbreviations  in  Eq.  (3.11)  , 


n 


1  2 

1  2 

-  2b  , 

-2c 

n 

n 

—  2ab  , 

—  2bc 

n 

n 

—  2ac 

n 

These  are  not  algebraic  equalities,  but  rather  probabilistic  ones  valid  in  large  samples 
(which  suffices).  If  we  substitute  in  Eq.  (4.  9),  there  results 


0  =  2[C1-p1C()]x  -  [C^-P^lx  +  lp^+Cy-  (4.10) 
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The  solution  of  this  equation  corresponding  to  the  positive  square  root  is  a  (strongly) 
consistent  estimate  of  |,  as  we  now  proceed  to  show. 

Continuing  to  use  the  abbreviations  in  Eq.  (3. 11),  we  have  from  Theorem  4 

co  =co"r(0)'*ro 

CJ  -Cj-ow-yj"  r0(  (4.11) 

C;  =C2-.7(2)-Y2=y0(2{2-l) 

with  probability  one  as  n  —  00 .  Upon  replacing  by  a(h)  in  Eq.  (4. 10)  we  see  that 
all  coefficients  vanish  by  the  definition  of  p^  and  p^.  Therefore  the  solutions  of 
Eq.  (4. 10)  must  also  be  the  solutions  of 


0  =  2[  C* 


picq1 x 


1C2 


x+  [p^CJ  +  C*) -(l-*p2)Cp.  (4.12) 


Let  (resp.  |  )  denote  the  root  of  this  equation  corresponding  to  the  positive 

(resp.  negative)  square  root.  Since  the  probability  one  limit  of  a  solution  as  n-« 
is  the  corresponding  solution  of  the  limiting  equation,  it  follows  from  Eq.  (4. 11)  and 


Eq.  (4. 12)  that 


± 


where  the  numbers  and  f;  solve 

0  =  2(|-Pl)x2  -  (2£2-l-p2)x  +  [2|2Pl-(l  +p2U)  . 
The  product  of  the  roots  is  given  by 


2 €p:  -  (1+P2) 

2(1 -Pj) 


(4. 13) 


47 


Direct  substitution  shows  the  indicated  factorization  is  the  correct  one,  so  either  £ 
or  |  is  £.  Since  this  solution  does  not  depend  on  p^  or  p2>  we  can  set  them  both  to 
0  and  solve  Eq.  (4. 13)  to  resolve  the  sign  question.  We  get 


(2g  -1)+  V(2g 


l 


l)2  + 


4| 


=  I 


Returning  to  the  original  notation,  our  estimate  of  £  is  thus 

£  =  positive  square  root  solution  of 

2[Cn(l)-PlCn(0)]x2  -  [Cn(2)-p2Cn(0)]x  +[P1(Cn(0»+Cn(2)-(ltP2)Cn(l)]  =  0  . 

(4.  14) 

The  equation  can,  of  course,  be  divided  through  by  C  (0)  >0.  The  coefficients  are 
then  simpler  expressions  involving  C  (h)/C  (0)  =  R^fli)  (h=  1,  2). 

We  next  establish  Eq.  (3.  7)  for  the  estimate  in  Eq.  (4. 14),  and  evaluate  the 
variance  function.  Let  us  now  use  the  abbreviations 


bo  =  cI'picS  'o' Wo 

VC2-<>2C0  ^“Wo 

b2  “  +  C2>  -  0+P2>Cl  ^2  *  Pl%  +  T2>  -  (1+p2)Tl 


which,  of  course,  have  nothing  to  do  with  the  /?/s  in  Eq.  (2.  23).  In  the  notation  of 

Theorem  4  and  Eq.  (3.  11),  N/n( C *  -y  )  is  D  .  Thus 

h  h  h 

B0  =  ^<b0  ’  V  =  °1  "  pl°0 


Bi  ^^l  -  Pq)  d2  "  ^2D0 

B2  =  *Jn(b2  -  p2)  =  Pl(D0+D2)  -  (l+p2)D1  . 
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The  statements  that  the  estimate  and  parameter  |  respectively  satisfy  Eqs.(4. 12) 
and  (4. 13)  are 

2brt £2  -  2b A  +  b  =  0 
On  In  2 

2P0H2  -  2/3^  +  /32  =0 

after  multiplying  the  latter  by  -y^.  If  we  subtract  these  two  equations  and  multiply 
through  by  'Jn  we  find  the  equality 

[2b0(£n+£)  -b2]  ^(4n-O  =  -2B0^2  +  B^-B2.  (4.16) 

According  to  Eq.  (4. 15)  and  Theorem  4,  the  right  side  tends  to  normality  as  n-® 
with  a  variance 

lim(5(2B0^2-B1^  +B2)2  =  F4(|).  (4.  17) 

As  shown  in  the  next  paragraph,  this  is  a  polynomial  of  degree  4  in  the  unknown 
parameter  (hence  the  notation),  even  though  the  limiting  second  moments  of  the  D^'s 
involve  powers  of  £  as  high  as  4.  We  have  already  shown  that  £n  — -  |  with  probability 
one.  This,  together  with  Eqs.  (4. 11)  and  (4.6),  proves  that  the  random  variable  in 
Eq.  (4. 16)  multiplying  the  scaled  deviation  of  interest  converges  to 

4V-0i=ropW. 

It  follows  that  Eq.  (3.  7)  is  true,  and  the  variance  function 


V2(|)  = 


4 

F  (Q 

[  2£2  -  4pj£  +  l+p2]2 


(4. 18) 


is  finite  provided  Eq.  (4.  7)  holds.  (The  correlations  p ^  and  p2  are  not  to  be  confused 
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2  2 

with  the  amplitude  p  in  -y^  =  |p  |A(0)|  .)  It  remains  to  evaluate  the  numerator  by 
substituting  Eq.  (4. 15)  into  Eq.  (4. 17),  and  using  the  formula 

lim  f  D  D  =  ip  (a,  b  =  0, 1,  2) 

n  a  b  Tab 

given  in  Eq.  (3. 14).  The  reduction  entails  several  pages  of  rather  frustrating  algebraic 
manipulations,  and  we  present  only  the  highlights. 

We  first  express  Eq.  (4. 17)  as  a  sum  of  quartics  in  £  weighted  by  the  6  different 
if)' s.  We  find 

F4(£)  =  ^oo^l^4  +p\ ]+•*•  +  ^2  "2P]£  +  •  (4-  !9) 

The  manner  in  which  this  is  to  be  completed  is  clear  from  Table  2,  reading  ip's  in 
place  of  <p's.  Now  the  first  terms  in  i p  ,  ip  , . . . ,  ip ^  are  proportional  to 
1,  £,...,  (2£^_i)2,  if  we  multiply  the  corresponding  rows  of  the  table  by  these  poly¬ 
nomials  and  collect  the  coefficients  of  £*\  £**,  . . . ,  £^  we  find  that  they 
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all  vanish.  Thus,  as  was  previously  the  case  for  Eq.  (3. 15),  there  is  no  contribution 
from  f  (0)  .  and  the  formula  is  indeed  given  by  Eq.  (4. 19)  with  ip's  replaced  by  cp's 
(which  we  anticipated  in  forming  the  table).  These  latter  are  independent  of  the 
parameter,  so  working  with  the  columns  we  obtain 


F4« )  =  [  ^  -  8PJ  +  4^>]  | 4 

+  1  -4PiP2<p^)+4p2^C1)  +  4 + 


(4.  20) 


‘VoO  ‘  2P1(1+P2)  *’01>  +  ' ' '  +  P1^22  1 


(c) 

where  in  Table  2  y  =  is  computed  from  Eq.  (3. 14). 

3-13  3D  2 

One  may  obtain  a  neater  expression  by  dividing  through  by  <j  (0).  The  coefficients 
in  Eq.  (4.  20)  then  become  dimensionless  functions  of  the  correlations 


p^  =  <r(k)/<7(0)  k  =  0,±1,  ±2, . . . 


of  {ut} .  When  this  entire  sequence  is  known,  we  can  set  up  large  sample  confidence 
intervals  on  0  in  the  same  way  as  we  did  with  Method  I  in  Eq.  (3. 18).  Indeed,  it 
suffices  to  replace  Eq.  (3.17)  by 


x2(l)  = 


4 

F  (|) 


[ 2£2-  4Pj|  +l+p2]2(l-£2) 


******* 


It  is  not  too  difficult  to  compute  Eq.  (4.  20)  when  {u^}  is  a  moving  average  of  known 
order  q  if  (as  was  suggested  initially  for  motivational  reasons)  we  take  c=q+3.  Then 
the  only  contribution  in  Eq.  (3. 14)  is  from  the  k=0  term,  so  that 
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a,  b  =  0,  1,  2  . 


(4.21) 


.  Q  +  P  Pv) 

b-a  a  b 


We  get  the  coefficients  in  Eq.  (4.  20)  by  taking  the  "inner  product"  between  the 
6-vector  created  with  this  formula  and  the  respective  colums  of  Table  2.  These 
turn  out  to  be  relatively  simple  functions  of  p^  and  p2: 

F4(S)/(T2(0)  =  4(1 -p2  )£4  -  4Pl(l-p2K3 
-  (3+4p2  +  p2  -  8p2)£2 
+  (1+P2)  (1+P2  '  2p\  )  • 


We  note  that  there  is  no  linear  term. 

Comparison  with  Method  I  in  the  white  noise  case.  When  {u  }  in  Eq.  (3.  1)  is 

4  1 

white  the  formula  for  F  becomes  particularly  simple.  It  is  worthwhile  comparing 

our  two  methods  in  this  special  case.  We  see  that 


(c)  _ 

^22 


and  that  the  other  three  (p's  with  different  subscripts  are  0.  This  is  true  independent 
of  c,  so  we  want  to  use  c=l  (and  not  q+3  =  3).  Setting  p^  =  p2  =  0  in  Table  2,  we  ob¬ 
tain  from  Eq.  (4. 18)  and  Eq.  (4.  20) 


4£  4  -  3£  2  +  1 

2  2 
(24  +1) 


For  Method  I  with  c=l  we  have  from  Eq.  (3. 15) 


(4.22) 
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(4.23) 


VI  =  -V^2+1> 

T0 

2  2 

where  y^  =  |p  |  A(0)|  is  the  same  in  both  equations  and  varies  with  the  operator  £  . 
(We  have  given  this  before  in  Eq.  (3.  23)  for  a  particular  £. )  The  relative  efficiency 
of  Method  I  to  Method  II  is  the  ratio 


V 


V2 

II 


I 


4g4  -  3g2  +  1 

(2|2  +  1)3 


*  1  . 


(4.24) 


which  we  have  plotted  in  Fig.  3.  This  depends  only  on  |  ,  so  as  a  function  of  9  it  is 
symmetric  about  the  midpoint  n/2.  The  estimates  in  Eqs.  (4. 14)  and  (3. 10)  with 
these  variance  functions  are 


i 11  = 

n 


V 


J 


C2  +  SCf 


4C , 


(4.  25) 


i1- 

n 


co’a 


where  C,  again  abbreviates  C  (h).  The  additional  amount  of  computation  required  by 
h  n  2 

Method  II  is  evidently  sufficient  to  overcome  the  knowledge  of  c t  utilized  in  Method  I. 

2 

In  case  there  is  no  filtering,  i.e.  is  p  cos(0t-<p)  plus  white  noise  with  variance  <j  , 

numerical  values  of  the  variance  function  for  0^  can  be  obtained  from  Eq.  (4. 14)  and 

n 

Table  1.  Indeed,  we  have 

V2  (cos  9)  4  4 


sin  9 


4a 


in  the  notation  of  Sec.  3. 
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{ 


Fig.  3.  Efficiency  of  Method  I  relative  to  Method  II,  in  the 
white  noise  case,  given  by  Eq.  (4.  24). 
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The  problem  of  how  to  pick  c  for  Method  II  in  the  illustrative  problems  is  an 
order  of  magnitude  more  difficult  than  it  is  for  Method  I.  We  leave  this  question 
aside,  since  it  can  be  resolved  in  any  specific  situation  (in  Problem  1  by  using  Eq.  (3.  21)). 

*  *  *  *  *  * 


When  the  noise  is  a  moving  average  of  order  q,  we  can  write  down  a  simple 

estimate  of  £  which  does  not  require  knowledge  of  and  p^.  The  price  paid  is  that 

the  variance  function  has  q+2  infinites  at  the  roots  of  T  ,„(!)  =  0.  The  restriction 

q+2 

that  £  does  not  fall  in  the  neighborhood  of  one  of  these  points  is  of  a  different  nature 
and  obviously  more  severe  than  Eq.  (4.  7).  Such  an  estimate  is  a  limiting  case  of 
the  one  presented  below,  which  illustrates  the  trade-off  between  knowledge  of  noise 
statistics  and  statistical  accuracy.  It  is  included  mainly  for  academic  reasons. 

In  generalization  of  Eq.  (4. 11)  we  have,  as  a  limit  with  probability  one  as  n  —  °°, 


C*(h)  =  C  (h)  -  o<h)  -  y  (h)  =  J(0)T  (!) 
n  n  v  n 


(4.  26) 


where  we  now  denote  y  (0)  by 

V 


J(0)  =  Ap2|A(0)|2>O. 


(4.27) 


Using  the  abbreviations  in  Eq.  (3. 11),  the  limit  sequence  obeys  the  recursion 


rh+r2{Vyh-is  0 


(4.28) 


which  defines  the  Tchebichev  polynomials,  and  upon  which  we  based  Method  II. 
for  any  fixed  integer  r  s  1, 


Thus, 


(4.29) 


is  a  strongly  consistent  estimate  of  £ ,  and  it  can  be  computed  once  we  know  the  values 
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of  o-(r-l),  a(r)  and  a(r+l).  (For  r=0,  Eq.  (4.  29)  can  be  interpreted  as  Eq.  (3. 10), 

and  in  this  sense  generalizes  Method  I.)  Let  us  assume  that  y  f  0,  i.e.  that  9 

r 

does  not  coincide  with  one  of  the  r  roots  of  cosrO  =  0: 


0  t 


2i+l 

2r 


7 T 


(i  =  0,  1,  . . . ,  r-1). 


(4.30) 


Then 


*^(ln“«)  = 


^c;+l+c;.?  ^<wvi> 


2C* 

r 


2y„ 


- -  (D  +D  ) 

2y  K  r+i  r-l' 
r 

in  the  sense  of  equality  of  asymptotic  probability  distributions.  According  to 
Theorem  4,  the  variance  of  the  limiting  normal  distribution  is 


V2({)  =  -y»rtlrtl  +  2^!  _!  +  !Cr.1>r.1)- 
4y 

r 


From  Eqs.  (3.  14),  (4.27)  and  cosh0  =  T,  (£)  =  y.  /J(0)  we  have 

n  n 

fc(0)  (c) 

^ab  ^  J(0)  ^a^b +  ^ab 


But  according  to  Eq.  (4.  28) 


1  (yL  +  2r-^,y  ,+y?.,)-  (  Tr+1,V'  ^=i2- 


4y 


2  r+1  r+1  'r-l  'r-T 


2y_ 


Therefore 


2  fc(0)  2  1 

V  (0  =  8 + 


m 


f(0) 


(c)  ,  9  (c)  (c) 

^  r+1,  r+1  ^ r+1,  r-l  ^r-l.r-l 

4T^(0 


(4.31) 
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Unlike  the  situation  in  Methods  I  and  II,  the  leading  term  in  ip  ,  contributes  to  the 

ab 

variance  function  of  the  estimate. 

For  the  choice  c=l,  we  can  write  in  terms  of  the  spectral  density  f(w)  =  f^cu) 

given  in  Eq.  (2.  26).  Using  the  inversion  formula  , 


a(k)  =  2  \  coskco  f(co)da), 
J0 


we  obtain  for  Eq.  (3. 14),  after  translating  the  index  in  the  second  sum  by  b, 

OO 

^ab  =  £,  a(k^  CT(k+b-a>  +  CT(k+b+a)  ] 


k=-°o 


cn 

=  87r  \  cosawcosbcuf  (w)dw 
J0 


(4.32) 


For  the  numerator  of  the  second  term  in  Eq.  (4.31)  we  therefore  have,  using  once 
again  the  basic  recursion, 

(D  .  o  „(1)  .  (!) 

^r+1,  r+1  +  2<pr+l,  r-1  +  *r-l,  r-1 


I’ 

•J  n 


2.2. 


=  87r  ^  [ Tr+1(cos w)  +  Tr  j(coscu)]  f  (a»)dco 


=  8tt  \  [2cosca  T  (cosco)]  “f“(w)doj. 

J0  r 


2.2. 


Consequently,  in  terms  of  cosines. 


2  2  2 

2  \  cos  rucos  cof  (co)dw 

v2(5>=8,M^JL+f-^ - 5 - 

J(0)  J2(0)  cos  r0 


(4.33) 


The  special  case  in  which  u^_  =  £w^  is  of  interest.  Letting 
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denote  the  signal  to  noise  ratio,  we  have  from  Eq.  (2.28) 


J(0)  =  2ir  R  f(0). 


Thus 


V2(4),i^l 


1  + 


1  J  ( 


2  2  2 
cos  rco  cos  tof  (o>)do> 


2nR  cos2r0  cos20  f2(0) 


(4.34) 


The  ratio  involving  the  integral  will  become  smaller  as  the  spectrum  of  the  operator 
£  becomes  more  peaked  at  w=0,  as  one  would  expect. 
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5.  A  DIFFERENT  APPROACH 

At  present  there  is  considerable  interest  in  the  subject  of  statistical  spectral 
analysis.  In  particular,  there  is  the  problem  of  choosing  and  interpreting  sample 
spectra  which  are  computed  from  data  modeled  as  a  superposition  of  trigonometric 
terms  additively  disturbed  by  a  0  mean  stationary  noise  process.  Such  processes 
are  asymptotically  stationary  (cf.  (5.  2)  below),  and  are  said  to  possess  a  mixed 
spectrum.  The  time  series  with  which  we  have  been  dealing. 


=  £[pcos(0t-<p)]  +  ut  > 


is  essentially  a  case  in  point.  Using  existing  spectral  estimation  theory,  we  will 
present  in  this  section  another  method  for  estimating  the  unknown  parameter  0  <  9  <  tr. 
Although  the  technique  is  computationally  expensive,  and  not  designed  for  real  time 
usage,  we  include  it  for  comparative  reasons. 

Let  us  redefine  our  sample  covariances  by 


C  (k)  = 
n 


n-|k| 

n  ^  ZtZt+|k| 
t=l 


k  =  0,±l,  . . .  ,±(n-l) 


(5.  1) 


0 


|k|  —  n  . 


We  have  without  loss  of  generality  assumed  tg=l,  and  we  could  (if  desired)  retain  the 
integer  c  ^  1 .  According  to  Theorem  2  we  have,  in  the  notation  of  Eq.  (4.  27), 


lim^C  (k)  =  J(0)cosk0  +  cr(k) 


s  C(k) 


(5.2) 


for  every  fixed  k.  A(6)  is  the  value  of  the  transfer  function  of  the  linear  operator  £, 
and  cr(  • )  is  the  (summable)  covariance  sequence  of  {u^}  with  spectral  density  function 
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f(w)  =  fj(w)  given  in  Eq.  (2.26).  Thus, 


C(k)  =  J(0)cosk0  +  \  cos  kcu  f(w)dai 


f. 


■  f; 


cosko>dF(u>) 


where  F(  •)  is  the  nondecreasing  spectral  distribution  function  of  {z^}  consisting  of 
a  jump  at  o>=0.  of  magnitude  J(0),  plus  the  absolutely  continuous  spectral  distribution 
function  of  {u^}. 

Following  Parzen  (Ref.  [  9] ),  we  consider  the  following  function  of  the  first  n 

observations  z, , z„, . . .  z  : 

12  n 


S 

n, 


_1 

2tt 


I 


w(k/m)C  (k)cosko;  . 
n 


|k|  <m 


(5.3) 


This  is  periodic  in  co  with  period  2ir,  and  even  about  o>=0  (as  will  be  all  functions  of 
angular  frequency,  both  random  and  nonrandom).  The  integer  m=m(n),  called  the 
truncation  point,  is  to  be  chosen  (for  reasons  which  will  become  clear)  so  that 

2 

m  —  oo  m  /n  —  0 


as  n  —  °°.  The  function  w(*)  is  called  a  covariance  weight  generator,  and  it  is  assumed 
to  have  the  properties 

w(0)  =  1  w(-x)  =  w(x) 

w(x)  =  0  for  all  |  x  |  ^  1  . 


Further  conditions  will  be  imposed  on  w(.)  as  we  proceed.  The  Fourier  transform 


W 

m 


(oj)  = 


|k|< 


w(k/m)  coskea, 
m 


(5.4) 
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is  called  the  spectral  window.  It  is  said  to  be  generated  by  the  (aperiodic)  function 


W(z) 


w(x)  cos  zxdx 


because 


W  (co)  =  mW(moj) 
m 


(-00  <  z  <  °°),  (5.  5) 


(5.6) 


holds  in  the  limit  of  large  m. 

We  first  investigate  the  expectation  of  Eq.  (5.3)  for  large  sample  sizes  n.  From 
Theorem  1  it  follows  that 

n-|k| 

£cn00  =  £  ^  [g0(t)g0(t+|k|  )+  cr(k)] 
t=l 

I  k  I 


=J  (0)cosk0  +  (  1 


n 


o-(k)  +  0(l/n), 


where  the  order  term  is  independent  of  the  lag  k.  Thus,  for  fixed  w,  we  have 

f  S  (w)=4r~^  T  w(k/m)cosk0  coskw  'S  w(k/m)(^l-- — ^cKkJcoskcu 

n,  m  Z7T  /_j  lk  i_j  \  n  / 

| k |  <  m  |k|<m 

+  0(m/n) 


because  w(  • )  is  bounded.  Using  Eqs.  (2.  5)  and  (5. 4)  in  the  first  term  we  get 


tK  [W  (w-0)  +  W  (w-W)]  +  s  (u>)  +  0(m/n)  (5.7) 

v  n,  m  m  m  n 


where  s  (co)  stands  for  the  second  sum.  We  have 
n 
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f(cu)  -  s  (u>)  =  /  [  l-w(k/m)]  a(k)  coskw 

n  Alt  Lj 

I k |  <m 


~2~~  ^  |k |  a(k) coskw  +~t  1  ff<k>  coskco 
1  k  | <  m  |k  |  2: m 


(5.8) 


=  1°  +2°  +3° 


Suppose  now  that  2  k  |<j(k)  |  <  00 ,  so  that 


f"(co) 


■--ar  I 


k  cr(k)cos  kw 


k=-°° 


exists  for  all  oj.  Then  kcr(k)  is  certainly  summable,  and 


|  2°  |  =  0(l/n) 


1 3°  |  =£ — l—  ^  k2 1 cr(k)  |  =  o(l/m2)  . 


27rm 


|  k  |  —  m 


(5.9) 


Suppose,  farther,  that  the  weight  generator  is  such  that 


■±^*L-a  (0<|a|<«). 


lim 

x  —  0  x* 

Most  w's  will  have  at  least  a  local  maximum  at  0,  so  we  might  as  well  consider  a  >  0 
(although  this  is  not  necessary).  In  any  case,  there  exists  an  integer  m'  which  depends 
on  m  in  such  a  way  that  m'  —  °°,  m' /m.  —  0,  and 


|  1-  w(x)  -  ax2  |  <  — y~ 


for  all  lx  I  <  —  . 
1  1  m 


m  logm 
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We  rewrite  the  leading  term  in  Eq.  (5.8)  as 


1°  =  - —r*  ^  k^o-(k)cosk^ 


27rm 


k|<m' 


-ki 


[  1  -w(k/m)  -  a(k2/m2 )]  c(k)  cos  kco 


|k|< 


m' 


1  V  1  -w(k/m)  ,2 
+ - j  /  - 2 - 2  k  COS  kw 

27rm  ,  |  .  k  /m 


=  4°  +  5°  +  6° 


Since  |  l-w(x)|/x  is  bounded  for  all  x  by(say)  B, 


5°  |  < 


1 


27rm  logm 


I  cr(k)  |  =  o(l/m2) 


I  k  |  < 


m' 


6°  |  <  — ^  k2 1  cr(k)  |  =  o(l/m2)  . 


27rm 


m'^|k|<  m 


Finally,  with  regard  to  4°,  we  have 

lim  y  k2 <r(k) coskw  =  -af"(o>)  . 

m'  00  i,i  . 

|k|<m' 

2 

Multiplying  Eqs.  (5.  8)  -  (5. 11)  through  by  m  ,  there  results 


lim  m  [  s  (o>)  -f(a>)]  =  af"(co) 
n  n 


from  our  supposition  that  m  /n-*0.  We  therefore  have  for  Eq.  (5. 7) 
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(5. 10) 


(5.11) 


(fsn>  m<*>)  =  J(0)  hi  Wm(w-0)  +  WJw+e)  ] 

(5.  12) 

+  f(co)  +  --  ~  +  0(m/n) 
m 

2 

plus  terms  of  smaller  order  than  1/m  . 

Let  us  now  restrict  attention  to  weight  generators  w(  • )  which  are  twice 
differentiable  throughout  some  neighborhood  of  the  origin.  Then,  after  inverting 
Eq.  (5.  5)  and  doing  the  differentiations,  we  obtain 

°0 

a  =  w"(0)  =  \  V  z^W(z)dz  . 

-00 

3 

We  must  have  z  W(z)  =  0(1)  as  |z|  —  °°,  for  otherwise  the  integral  would  not  be 
finite.  There  results  from  Eqs.  (5.6)  and  (5.  12) 


fSn,m(W)  =  hW(0)Uw)m  +  m  +  o(l), 


m  =< 


hp\M0)\2 


if  w  =  e 


otherwise 


(5. 13) 


The  order  term  gives  to  0  as  the  slower  of  1/m  and  m/n  (usually  the  former). 

Consider  next  sequences  m^  and  m 2  going  to  infinity  with  n  in  such  a  way  that 
each  is  o(*Jn).  Then  it  can  be  shown  by  working  in  the  frequency  domain  (Eq.  (5. 10)  in 
Ref.  [  9] )  that 


nCov{  S  (w),  S  (o>)}  =  2ttW  (O)J(w)  f(cu)m  m 
n,  n,  A  z 


+  2 7rf  (co)  m 


im2  h  w<mi 


(5.  14) 


z)  W(m2z)dz  +  o(l). 
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In  particular,  for  the  variance  at  0  <  <tt,  we  have 

VarS  (o>)  s  27rW2(0)J(o))f(o))  — +f2(w)  C  w2(x)dx—  (5.15) 

n,  m  n  J  ^  n 

which  goes  to  0  as  n— - 

The  method  suggested  in  Ref.  [9]  for  estimating,  at  a  prescribed  o>,  the  value  of 
the  (possibly  zero)  jump  as  well  as  the  value  of  the  spectral  density  function  is  based 
on  choosing  a  small  number  of  truncation  points  m  <  m  <  m  <  m  (say).  Writing 

I  Z  J  4 

y.  for  S  (w),  a  for  g  W(0)J(co)  and  /?  for  f(o>),  Eq.  (5. 13)  gives  in  large  samples 
i  n,  m . 

x 

a  "regression  model” 

y.  =  am.  +  B  +  e. 
i  i  i 


with  correlated  "errors".  One  can  thus  consider  the  computation  of  the  Least  Squares 
estimate  of  a  and  /3;  that  is  to  say,  locating  the  minimum  of 
4  4 

X  X‘£Vv“m r«(y, (5-i6) 

i=l  j=l 


with  respect  to  the  unknown  parameter  pair.  Here  <j ^  denotes  the  ij**1  element  of 

the  inverse  of  the  matrix  of  quantities  £e^e.  derivable  from  Eq.  (5. 14). 

But  in  the  problem  as  originally  posed,  we  assume  there  exists  a  line  at  some 
unknown  6  .  We  wish  to  estimate  this  angular  frequency;  there  is  no  interest  in  the 
continuous  spectrum  of  the  disturbing  process  {u^}.  For  such  situations,  the  following 
procedure  was  introduced  as  a  method  for  estimating  the  absolute  maximum  of  a 
spectral  density  function  (Gardner,  Ref.  [4] ).  However,  it  can  be  applied  to  the 
present  situation,  and,  in  fact,  to  mixed  spectra  in  general  where  there  are  a  finite 
number  of  lines  interior  to  (0, 7r). 

Let  us  replace  m  by  m+1  in  Eq.  (5.3),  and  denote  the  coefficients  by 
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for  |k|  ^  m 


fyjOO  = 


w( 


WCn<k> 


0 


for  k  >  m 


The  weight  generator  w(  • ),  in  addition  to  the  properties  previously  cited,  is  to  have 

a  transform  in  Eq.  (5.  5)  which  is  positive  at  every  point  of  the  real  line.  Then 

Sn  m(a’)  is  nonnegative  on  (0, 7r),  because  it  can  be  written  as  a  (scaled)  convolution 

of  W(  • )  with  the  positive -valued  periodogram  of  the  data  z  ,  z  ,  . . . ,  z  .  Fix  (large) 

i  l  n 

values  of  n  and  m.  Then 


%(c o)  =  S  (w)  =  tt-  > 
CP  '  n,  mv  ’  2n 


^(k)  coskw 


|  k  |  ^  m 


has  an  absolute  maximum  at  some  point  9*: 


(5.17) 


<  *Q(0*) 


for  all  cx>  ^  9*  (the  probability  being  0  that  there  will  be  two  or  more  such  points).  We 
now  iteratively  generate  Fourier  coefficients  tfi  (  • )  for  j  =  1,  2, . . . ,  J  by  means  of 

|h|s2i"1m 

$.(k)  =  -  (|k|<2Jm).  (5.18) 

I  >>< 

|h|^2j_1m 


For  each  j,  this  is  even  in  k  and  vanishes  for  all  integer  lags  |k|  >  2‘*m.  Further¬ 
more,  the  ip.(  •)  are  the  Fourier  coefficients  of  a  nonnegative  function 


66 


*j(w) 


2?  Jj  ^j(k)C0SkCU  • 

|k|^2jm 


Using  the  convolution  formula,  it  is  not  difficult  to  see  that 


*j(«)  = 


(5. 19) 


which  is  valid  for  any  o>.  Dividing  both  sides  by  ^(0*),  we  see  that  the  ratio  tends 
to  0  as  J  —  00 ,  unless  w=0*,  in  which  case  the  limit  is  1.  Therefore 


lim  ip  (k)  =  cosk0* 

J  —  00  J 


(5.20) 


The  iteration  in  Eq.  (5. 18)  can  provide  an  arithmetic  scheme  for  approximating 
the  mode  of  the  estimate  in  Eq.  (5.3),  which  does  not  entail  its  calculation  over 
a  grid  of  frequencies.  One  way  to  devise  a  method  for  "extracting"  an  approximation 
to  9*  from  the  i/y  s,  when  J  is  finite,  is  to  consider  the  expression 


^j(k)  =  aj(k)cosk0*  +  bj(k) 


for  k  =  0,  1,  . . . ,  K.  In  view  of  Eq.  (5.  20),  coefficient  sequences  (which  of  course 
depend  on  Eq.  (5. 17))  can  be  found  such  that 


lim  bj(k)  =  1  -  lim  a^(k)  =  0  . 

For  a  proper  choice  of  K  =  o(2^)  the  approach  can  be  made  uniform  in  the  lag  variable 
k.  Suppose,  further,  that  we  are  given  a  real  number  -te(0, 1),  which  also  depends  on 
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^q(-)  and  J  but  not  on  0*.  Now  let  v  be  the  number  of  ”t  -  threshold  axis  crossings" 
in  the  series 


^j(O)  =  1,  0(1),  0(2),...,0(K). 


That  is  to  say,  v  is  the  number  of  times  the  series  goes  from  a  value  >  £(resp.  <  t) 
to  a  value  <-L(resp.  >  l).  The  ratio 


fl*  _  H 
9J  '  K 


(5.21) 


in  [  0, 7r]  then  converges  to  0*  as  J  —  ,  for  an  appropriate  definition  of  l . 

We  can  estimate  0  with  Eq.  (5.21)  because  the  random  variable  0*  =  0* 

n, 

consistent  estimate  of  0  as  n  —  00 .  In  view  of 


m 


is  a 


1 0*  -  0  |  ^  |  0*  -0*  |  +  |0*  -  0  | 


the  problem  centers  around  properly  relating  the  number  of  iterations  J  and  the  sample 
size  n  in  order  to  (statistically)  balance  the  rates  at  which  the  bounds  go  to  0.  The 
design  formulae  given  in  Ref.  [  4]  are  not  applicable  because  they  are  derived  under 
the  assumption  that  lim  f  m(Ct’)  is  sufficiently  differentiable.  If  computation  time 
is  not  a  serious  consideration,  it  appears  worthwhile  investigating  the  above  method 
for  the  line  case  in  Eq.  (5. 13). 
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